PREFACE


The growth of interest in tests and measurements in industrial
education has been rapid in recent years. The last decade has produced
numerous significant investigations in the curricular aspects of these
special subjects, on which have been built newer and better materials
and methods of teaching. A renewed interest in the possibilities of
measurement of special aptitudes and achievement in the industrial
education fields naturally parallels this type of development.

This book is planned to fit into this program. It is designed to
bring to the attention of the shop teacher, and to students in training
for this type of work, a simple and practical discussion of the
essential principles of educational measurements as applied to the
teaching of shop and drawing courses. It is based upon a considerable
number of years of experience on our part in the teaching of courses in
educational measurements and in methods in the industrial arts fields.
In addition to these major functions, this book is planned to stimulate
a renewed interest in the more adequate evaluation of student
achievement by teachers of industrial education who have already had
some experience with the work. It brings together and evaluates many of
the more important contributions to measurements in industrial arts
and industrial education. We earnestly hope that it may also serve
to stimulate further interest and work along these lines.

In presenting this material we recognize the difficulty of covering in
an adequate manner the many difficult problems. Throughout the
book, the aim has been to emphasize the practical rather than the
theoretical. It is not planned to displace general treatises on
measurements or statistics. On the other hand, it is hoped that the
straightforward presentation of the problems of measurement in this
subject may eliminate the necessity for technical training in
measurements and statistics in order for the student or teacher to use
this book effectively.

We wish to acknowledge our great indebtedness to the many classroom
teachers and supervisors who have contributed directly and indirectly
to the materials presented in this discussion. The kindness
of authors and publishers who have given permission for the
reproduction of many selected portions of their work and publications is
likewise gratefully acknowledged. We are also indebted to Professor
Arthur B. Mays of the University of Illinois, who gave valuable
editorial criticisms; to Professor A. H. Edgerton of the University of
Wisconsin for encouragement and valuable suggestions; to President
Butler Laughlin of the Chicago Normal College for editorial
suggestions; and to Professor Frank X. Henke of the Chicago Normal
College for illustrative drawings.

Chicago, Ill.,
May, 1935.

L. V. Newkirk
H. A. Greene

TESTS AND MEASUREMENTS IN
INDUSTRIAL EDUCATION 


CHAPTER I 
INTRODUCTION 

1. Significance of Measurement in Industrial Education.

Measurement in its various forms and phases appears to be recognized
as an integral part of good classroom procedure. In no instructional
field is there greater need for the application of the principles
of educational measurement than in industrial education. Industrial
education¹ teachers and supervisors need reliable measuring
instruments in order to give more adequate educational guidance, to
evaluate personality traits, to motivate learning, to study the
effectiveness of teaching materials and methods, and to measure pupil
progress more accurately through the establishment of more definite
standards of performance and through the diagnosis of pupil
difficulties. The use of tests for such purposes in other fields is a
well-established practice. The fundamental principles of scientific
test construction and interpretation may be applied to the measurement
problems of the shop and the drafting room when modified in the light
of special needs. It is the purpose of this book to explain and
illustrate many of the applications and modifications of recognized
principles of measurement to these specific fields.

1 Throughout this text the term industrial education is used to include
the general courses of the secondary school variously known as manual
training, manual arts, industrial arts, and industrial arts education,
and the vocational work of the continuation school, trade school, and
evening schools. The fundamental measurement problems in all these
courses are similar, although the objectives of the courses vary from
cultural to strictly vocational.

The marked increase in the interest in educational measurements
on the part of teachers of industrial education is not surprising. It
is to be expected of a group of teachers who have had to face the many
problems of a new and growing unit of instruction. In many ways the
teachers of industrial education are most fortunate. They are working
in a new and growing field of instruction which rapidly is becoming
organized in the light of modern educational objectives. They have
the advantages of all the methods and techniques that have been
developed for measurement in other fields. They are in a position to
utilize the good and discard the worthless results of earlier efforts.
A large number of accepted principles and practices for use in
constructing and interpreting measuring instruments are now available
for the evaluation of the products of teaching. From the standpoint
of professional qualifications and classroom efficiency it is the
industrial education teacher's business to understand these
well-established principles and their special application and use in
their own fields of instruction.


2. Teachers’ Marks in Industrial Education.

The earlier studies of teachers’ marks revealed the fact that such
measures were entirely unsuited for the evaluation of pupil
achievement since they were extremely subjective and quite unreliable.
However, these earlier studies confined themselves mainly to teachers’
marks used in the rating of accomplishment as it is revealed on the
written page. It is true that many of the teachers’ marks in
industrial education are given on this same basis, but there is also
the matter of rating actual projects and drawings from the shop and
drafting room as well as the manipulative skills. The absence of precise
information on the exact subjectivity and unreliability of such marks
in the industrial education fields prompted the authors to carry on a
series of investigations in this field. The results are presented here as
further evidence of the need for improved methods of measuring the
accomplishment of students in these subjects.

Three samples from each of the fields of woodworking, drawing,
and sheet metal were selected for study. The woodworking projects
consisted of one gray wren-house, one red wren-house, and one rolling
pin. The drawings were simple inked drawings, and were known as
numbers 1, 6, and 7. (See Fig. 1.) The sheet-metal projects
consisted of three funnels which were numbered 1, 2, and 3.

A group of experienced industrial education teachers cooperated in
rating these projects. The marking was done through individual or
group conferences with the teachers and in accordance with the
following instructions:


1. Pay no attention to the name of the maker of the projects but rate them
entirely on the basis of what you would consider perfect.

2. Rate on a scale of 100, giving 100 for perfect, 60 for half perfect.

3. List the factors that you took into consideration in rating the projects. 

The teachers were asked to rate the project only on the basis of what
they considered perfect, and to give no consideration to grade
standards for such a project. The factor of grades was avoided, since it
was the main purpose here to discover how much variability there is
among teachers in their concept of what is good workmanship. The marks
and rating factors obtained on beginning woodworking, drawing, and
sheet-metal projects are given in Tables 1 to 6 inclusive.

Fig. 1. Samples for rating projects.

Results of Rating Woodwork Samples. Table 1 shows a range of
40 to 81 for the ratings of the red wren-house, 70 to 96 for the gray
wren-house, and 75 to 98 for the rolling pin. There is also a tendency
for the ratings to bunch at certain points on the scale. The letter
grades show in a rough way that pupils with projects of the same basic
quality might be assigned almost any of the variable passing marks,
depending upon which shop teacher rated them. It must be remembered,
however, that these results are not in all respects comparable
to the actual situation in the shop or at the drawing table. In either
of these situations the teacher almost certainly would grade on the
class average. Furthermore, it would be necessary to take into
consideration the physiological development of the pupils, their
intelligence quotients, and quite likely their mechanical aptitudes.
There would also be the factor of the pupil’s personality and class
attitude which might affect the teacher’s judgment.


TABLE 1

Ratings Assigned Woodwork Samples
(39 teachers)

    Red Wren-house             Gray Wren-house               Rolling Pin
Rating Frequency Mark      Rating Frequency Mark      Rating Frequency Mark

  81       1      C          96       1      A          98       1      A
  80       3      C          95       5      A          96       1      A
  78       4      D          94       1      A          95       6      A
  75      10      D          92       3      B          94       2      A
  74       1      D          90      11      B          93       2      A
  73       1      D          87       1      B          92       3      B
  70       3      F          85      12      C          90      14      B
  65       2      F          80       2      C          85       2      C
  61       2      F          75       2      D          83       1      C
  55       1      F          70       1      D          80       1      C
  53       1      F                                     78       1      D
  51       1      F                                     75       3      D
  50       6      F
  40       3      F


The data given in Table 2 show clearly that there is a wide variation
in the factors mentioned in scoring the same projects. There is a
distinct preference, however, for rating factors which group themselves
under the following headings: results of tool operations, finish, design,
fasteners, and utility. It appears, therefore, that much of the
variation in the rating of these projects was due to the varying
amounts of emphasis placed on these factors by the teachers themselves.

TABLE 2

Major Factors Considered by Judges in Rating the Three Woodworking Projects
(39 teachers)

Rating Factors                           Frequency

Finish                                      31
Joints                                      30
Proportion                                  23
Squareness                                  22
Nailing                                     19
Utilitarian                                 13
Commercial standard of workmanship          13
Design                                      10
Sanding                                      8
Fitting                                      7
Gluing                                       6
Dimensions                                   4
Choice of materials                          4
Planing                                      4
Shape                                        4
Accuracy                                     3


Rating Drawing Samples. The ratings of the drawings (Table 3)
show more variation and less tendency to cluster than is found in the
ratings of the woodworking projects. The two poorer drawings
(Samples 1 and 6) show the most variation and the best drawing the
least, although this drawing (Sample 7) has a variation equal
to the range of all of the passing marks.


TABLE 3

Ratings Assigned Three Beginning Drawing Projects
(27 teachers)

       Sample 1                   Sample 6                   Sample 7
Rating Frequency Mark      Rating Frequency Mark      Rating Frequency Mark

  94       1      A          88       1      B          99       1      A
  90       2      B          87       1      B          98       1      A
  85       3      C          86       1      C          97       1      A
  84       1      C          85       2      C          95       4      A
  82       1      C          82       1      C          93       2      A
  80       7      C          80       2      C          90       8      B
  76       4      D          78       1      D          85       6      C
  70       3      D          75       3      D          80       3      C
  65       1      F          70       6      F          75       1      D
  60       1      F          65       3      F
  40       1      F          60       1      F
  25       1      F          55       2      F
                             50       1      F
                             30       1      F


The frequencies given in Table 4 indicate that the drawing teachers
considered the factors of lettering, figures, and lines most often,
but they also took into consideration neatness, dimensions, erasures,
arrowheads, and accuracy with fair consistency. On the whole, the
drawing teachers showed a slightly greater variation than the
woodworking teachers in their ratings.


TABLE 4

Major Factors Considered by Judges in Rating the Three Drawings

Rating Factors                           Frequency

Lettering and figures                       62
Lines                                       62
Neatness                                    27
Dimensions                                  22
Erasures                                    18
Arrowheads                                  16
Accuracy                                    13
Cleanliness                                  8
French curves                                8
Placement                                    8
Completeness                                 7
Spacing                                      5
Projection                                   5
Joints                                       4
General appearance                           4


Rating Sheet-Metal Projects. Differences quite typical of all such
things are found for the three sheet-metal projects used in the study
(Table 5). It is apparent from Table 6 that the quality of the
soldering was the factor considered most critically by these teachers.

Summary of Ratings. The results of this study indicate that shop
and drawing teachers are highly unreliable in their ratings of the same
group of projects or drawings. Furthermore, they do not agree on the
relative importance of the factors to be considered in making such
ratings. There are probably not enough cases to warrant the
generalization that teachers of shop work and drawing are any more or
less subjective in their markings than are teachers of the academic
subjects. Yet the study does indicate that there is sufficient variation
in the estimation of quality in these specimens to introduce serious
errors in measurement based on such a procedure. Table 7 summarizes the
range in ratings and the corresponding letter marks for the nine
projects included in the study.


TABLE 5

Ratings Assigned Three Sheet-Metal Projects
(12 teachers)

       Sample 1                   Sample 2                   Sample 3
Rating Frequency Mark      Rating Frequency Mark      Rating Frequency Mark

  90       1      B          95       2      A          96       1      A
  87       1      B          92       1      B          90       3      B
  85       2      C          90       1      B          85       1      C
  80       1      C          82       1      C          80       4      C
  77       1      D          80       1      C          78       1      D
  75       1      D          76       5      D          75       1      D
  70       1      F          70       1      F          60       1      F
  60       1      F
  55       1      F
  50       1      F
  40       1      F


TABLE 6

Major Factors Considered by Judges in Rating the Three Sheet-Metal Projects

Rating Factors                           Frequency

Soldering                                   33
Proportion                                  13
Seams                                       13
Roundness                                   12
Shape                                       11
Wiring                                       9
Neatness                                     8
Accuracy                                     6
Forming                                      6
Design                                       6
Roughness                                    5
Joints                                       4
Curve                                        2
Crimping                                     2


TABLE 7

Summary of Ratings

Project          Range of Ratings     Corresponding Letter Marks

Woodwork         41; 26; 23           C-F; A-D; A-D
Drawing          69; 58; 24           A-F; B-F; A-D
Sheet metal      50; 25; 38           B-F; A-F; A-F
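
The range entries of Table 7 are simply the highest rating minus the
lowest rating assigned to each project. As an illustration only, the
short Python sketch below recomputes one of them from the gray
wren-house frequencies of Table 1. The function names are hypothetical
and are not part of the original study.

```python
# Frequency distribution for the gray wren-house (Table 1):
# rating -> number of teachers who assigned that rating.
gray_wren_house = {
    96: 1, 95: 5, 94: 1, 92: 3, 90: 11,
    87: 1, 85: 12, 80: 2, 75: 2, 70: 1,
}


def number_of_raters(freq):
    """Total number of teachers represented in the distribution."""
    return sum(freq.values())


def rating_range(freq):
    """Highest rating minus lowest rating actually assigned."""
    assigned = [rating for rating, count in freq.items() if count > 0]
    return max(assigned) - min(assigned)


print(number_of_raters(gray_wren_house))  # 39
print(rating_range(gray_wren_house))      # 26
```

The two printed values agree with the 39 teachers reported for Table 1
and with the second woodwork entry of Table 7.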


3. Need for a Knowledge of Measurements in Industrial Education. 

The rating of shop projects and drawings is difficult; it requires
a complex fusing of judgments based on a group of variable factors.
Yet, psychologically, the rating of shop projects and drawings is little
different from the rating of an English theme or a paper in
mathematics. In English the factors to be considered may be spelling,
sentence structure, paragraphing, punctuation, etc.; and in shop
subjects the judgment may be based on such factors as tool processes,
design, utility, finish, and fasteners. Teachers vary greatly in their
concept of what constitutes perfection in a project or drawing. The
same project or drawing looks different to different individuals, and
quite probably to the same individual under different circumstances.

Marks assigned by teachers of shop work and drawing evidently
are subject to the same types of errors as enter into all estimates of
achievement. The magnitude of the error is sufficient also to warrant
the conclusion that industrial education teachers need objective
measurements of the results of their teaching just as much as
instructors in any other field. The fact that numerous tests capable of
securing valid and reliable measures of subject-matter and tool skills
in this field have been constructed is proof that the fundamental
principles of test construction can be successfully applied to the
measurement of the results of teaching in shop work and drawing. The
problem here is not to develop new principles of testing, but rather to
modify, apply, and illustrate in the industrial arts field those
procedures which have been generally found to be sound and fundamental
in other subjects.

In addition to securing accurate measures of the results of teaching
there is the equally important need on the part of teachers of
understanding the student better, in order that they may be in a
position to give him the proper guidance in the selection of special
lines of training. To accomplish this, industrial education teachers
must know the individual levels of general intelligence, special
aptitudes, informational background, interests, appreciations, and
emotional traits and attitudes of their students.

SUMMARY EXERCISES FOR DISCUSSION

1. What specific factors appear to make objective measurement in industrial
   education quite difficult?

2. Rate a project in woodworking, drawing, or metal working, on a percentage
   basis, and record the characteristics of each project which influenced you
   most in assigning the marks.

3. If you have access to a class, have each student mark independently a project
   in each of the above fields, and compare the marks as to variability,
   following the procedure shown in Table 1.

4. Tabulate the characteristics of each project that were mentioned by the
   students as being considered in marking the project.

5. In your judgment, what factors largely account for the wide variation in marks
   of achievement assigned to products of a similar quality?

6. Suggest a number of devices which would seem to have possibilities for
   increasing accuracy in the assignment of marks to shop projects.


TYPES OF EDUCATIONAL TESTS

4. Essay-Type Tests.

The two general classifications of educational tests in common
usage are objective and essay-type tests. Objective tests are so
constructed that they can be scored without any guessing or subjective
judgment on the part of the user. In the traditional or essay test a
number of questions are made out covering the material to be tested
in a general way with statements similar to the following:

1. Name ten common cabinet woods.

2. How are the grades of sandpaper indicated?

3. What is the difference between spindle turning and face-plate turning?

4. What is varnish?

5. What is the principle of the internal-combustion engine?

The average teacher using the essay-type examination makes up five 
or ten questions on the subject being tested (drawing, woodwork, sheet 
metal, auto mechanics) and then allows the pupils thirty to fifty 
minutes to answer them. The directions for administering such a test 
usually consist in a statement reminding the pupils to write their 
names on each sheet before handing in the test. 

The scoring of the essay-type examination presents a real problem,
some phases of which were introduced in the preceding chapter.
The teacher’s principal object in giving the test is usually to secure
an estimate of the pupil’s mastery and retention of the informational
content of the course. In correcting an essay-type examination,
factors appear which influence the teacher’s judgment but which have
little to do with the actual evaluation of the student’s knowledge of
the subject. Some of these factors are English, including spelling,
sentence structure, paragraphing, composition; mechanical features of
the examination such as neatness, legibility, use of pen or pencil, use of
one or both sides of the paper, kind and size of paper used; the
quantity written; the sampling of the subjects represented by the
questions; the teacher’s attitude toward the pupil, or the pupil’s
attitude toward the teacher. The final mark is influenced by unknown
combinations of these factors. This means that the mark on the test is
an entirely inadequate expression of any one factor, and hence is an
unreliable measure of the entire field covered by the test. Thus the
essay test at best can furnish only the roughest measure of
achievement.

It is frequently argued by those defending the essay-type test that
it gives the student valuable training in the mechanics of writing,
spelling, thought organization, and expression. If this were actually
accomplished, the argument would be sound, but even an unbiased
observer of students engaged in writing essay-type examinations must
admit that the rush and strain of getting the words down on the
examination paper leaves very little opportunity for the training in
thought organization and expression which it should give. It seems safe
to conclude, therefore, that if a teacher desires to measure a pupil’s
ability to spell, write, and express himself, he should use tests designed
for that purpose and not confuse the issues.

5. Objective Tests. 

Properly constructed objective-test exercises are not influenced
appreciably by the conflicting factors which appear to invalidate
measurement based on the essay-type question. Objective exercises are
marked by two important and related features. These are (1) brevity
of pupil response, and (2) absence of personal judgment in scoring
the test exercises. These features of the objective exercise make it
equally suitable for use in the teacher-made informal examination,
and in the more carefully constructed standardized test.

Objective exercises are stated in such forms that the pupil is able
to indicate his understanding by the briefest and simplest of physical
responses, usually consisting of underlining or encircling a single word
or phrase. Because of this brevity of pupil response, many more
exercises may be submitted to the pupil, thus providing a more complete
sampling or coverage of the subject-matter. The quality of the
answers need not be evaluated by the teacher but may be scored as right
or wrong by comparison with an answer key. The use of the objective
form of the test exercise thus makes it possible for different teachers
to score the same test papers and secure identical results. A test
exercise which is perfectly objective may be scored repeatedly at widely
separated intervals and by different individuals without significant
variation in results. Such accuracy in grading test exercises can be
obtained only when the exercises are constructed in accordance with
certain rather well-known specifications.
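
Scoring by comparison with an answer key can be reduced to a purely
mechanical routine. The following Python sketch is offered only as an
illustration of that idea; the item numbers, the key, and the function
name are hypothetical and are not drawn from any published test.

```python
def score_objective_test(responses, answer_key):
    """Count the items on which the pupil's response matches the key.

    No judgment of quality is involved: each response is either
    identical to the keyed answer or it is not. Omitted items are
    counted as wrong.
    """
    return sum(
        1 for item, correct in answer_key.items()
        if responses.get(item) == correct
    )


# Hypothetical five-item multiple-choice key and one pupil's paper.
answer_key = {1: "b", 2: "d", 3: "a", 4: "c", 5: "b"}
pupil = {1: "b", 2: "d", 3: "c", 4: "c"}  # item 5 was omitted

print(score_objective_test(pupil, answer_key))  # 3
```

Because the rule contains no element of opinion, any two scorers
applying it to the same paper must arrive at the same score.
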

6. Tests and Scales.

Measuring instruments are roughly divided into tests and scales.
This distinction is of some value, but at times it is confusing because
some tests resemble scales or contain certain features of scales as an
essential part of their construction. Generally speaking, a test is a
measuring instrument used for the evaluation of any knowledge,
quality, or ability. It may measure degree of achievement, mental
ability, aptitude, or character traits. It may be made up of items of
uniform difficulty, or it may be composed of a series of items of
uniformly increasing difficulty or value. In the former case it is a
rate test; in the latter, it is a power test. The process of determining
the difficulty or value of test items is called scaling. The use of this
term possibly accounts for much of the confusion concerning tests and
scales.

A measuring instrument is a scale to the extent that it ranks
accomplishment directly in terms of systematic levels, grades, or ages.
An instrument which is made up of scaled items (items of
systematically increasing difficulty) and which expresses its results in
terms of the number of such items responded to correctly is still a test.
Such a test is made quite often by the selection of items of known
value from a scale. For example, a spelling test comprising words of
gradually increasing difficulty could be made from the Simmons-Bixler
High School Spelling Scale (see brief description of this scale, page 99)
by selecting the test words from columns in which a uniformly
decreasing percentage of pupils spell the words correctly. This test
might be treated as a scale if the scores on it were expressed in terms
of the scale value of the last word spelled correctly. If the results
were expressed in terms of the total number of these words spelled
correctly it would be considered a test.
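
The difference between the two ways of expressing a result may be seen
by scoring one pupil's paper both ways. The Python sketch below is an
illustration only: the word list and its scale values are invented for
the purpose and are not taken from the Simmons-Bixler scale, and the
function names are hypothetical.

```python
# Hypothetical spelling list, ordered from easiest to hardest.
# Each entry is (word, scale_value); the values are invented.
WORDS = [
    ("table", 1), ("hammer", 2), ("chisel", 3),
    ("varnish", 4), ("mortise", 5), ("veneer", 6),
]


def score_as_test(correct):
    """Test score: the total number of words spelled correctly."""
    return sum(1 for ok in correct if ok)


def score_as_scale(correct, words=WORDS):
    """Scale score: the scale value of the last word in the ordered
    list that was spelled correctly, or None if none was correct."""
    values = [value for (_, value), ok in zip(words, correct) if ok]
    return values[-1] if values else None


# One pupil's results, in the same order as WORDS.
pupil = [True, True, False, True, False, False]

print(score_as_test(pupil))   # 3
print(score_as_scale(pupil))  # 4
```

The same paper thus yields a score of 3 when treated as a test and a
value of 4 when treated as a scale.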

Most modern tests are really hybrids resulting from the
cross-breeding of these two forms. That is, they are made up of scaled
items, but the resulting scores representing accomplishments are
expressed in terms of the number of items responded to correctly. The
specimen test shown on pages 140-144 is an illustration of this type.

In the evaluation of accomplishment in industrial education the
quality scale has numerous uses. In general, scales of this type consist
in a series of specimens of the particular quality under consideration
arranged in ascending order of merit from very poor (or zero quality)
to very high quality. The accomplishment of the individual student
is expressed in terms of the value of the specimen on the scale most
nearly matching his own product. Obviously, the use of such quality
scales introduces a considerable amount of subjectivity into the
measurement, since the teacher’s judgment is necessarily involved in
assigning the quality rating. Such scales are used for the rating of
handwriting, free-hand lettering, drawing, electrical splicing,
soldering, wood-boring, riveting, forging, finishing, and many other
products. The techniques used in the construction and use of these
scales are discussed and illustrated in Chapter XII.

7. Standardized and Informal Objective Tests. 

Objective measuring instruments are further designated as
standardized tests and teacher-made tests and scales. Both types are
useful in measuring achievement in industrial education. A test is
standardized (1) if it is composed of exercises that have been selected
in the light of usual teaching practice and evaluated as to innate
difficulty, and (2) if it is accompanied by norms or standards
permitting the interpretation of results in levels of accomplishment.
Standardized tests are of value in comparing the accomplishment of a
class with general standards and in comparing groups in different
schools in the same system. Teacher-made or informal objective tests are
similar to standardized tests except that the test items are selected
directly from the content of the course of study. Usually the items in
such tests more closely parallel the material taught but are less
carefully formulated and evaluated than standardized tests. Generally,
too, no norms are available, but useful levels of accomplishment may be
developed from year to year by recording the scores each time the
test is given. Teacher-made objective tests are extremely useful in
measuring achievement and diagnosing instruction in the shop.
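
One simple way in which recorded scores might be turned into a local
level of accomplishment is to report what percentage of the earlier
scores a new score equals or exceeds. The Python sketch below assumes
this particular definition, which is only one of several in use; the
scores and the function name are hypothetical.

```python
def percent_at_or_below(score, recorded_scores):
    """Percentage of previously recorded scores at or below `score`.

    This is an assumed, simplified stand-in for a local norm; it is
    not a procedure prescribed by the text.
    """
    if not recorded_scores:
        raise ValueError("no recorded scores to compare against")
    at_or_below = sum(1 for s in recorded_scores if s <= score)
    return 100.0 * at_or_below / len(recorded_scores)


# Hypothetical scores recorded from earlier administrations of a test.
past_scores = [12, 15, 18, 18, 20, 22, 25, 27, 30, 33]

print(percent_at_or_below(22, past_scores))  # 60.0
```

As more administrations are recorded, the list of past scores grows
and the resulting percentages become steadier guides to interpretation.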

From the standpoint of their administration, standardized and
teacher-made tests may be classified as written, oral, and performance.
Psychologically there is little difference, because, after all, they are
all performance tests. It has been found advantageous in testing
different types of industrial education achievement to have the pupils
write the responses to some items, respond orally to others, and in
many cases express their knowledge by modifying material through
the use of tools and machines. The fundamental thing to note here is
that in order to measure scientifically it is necessary to secure a
response which can be rated objectively and compared with the same
response made by others.

8. Classification of Educational Tests. 

Educational measuring devices may be classified according to their
use and characteristics into achievement, diagnostic, prognostic, and
intelligence tests. Each of these types of instruments is useful in
measuring class and individual accomplishments, and in revealing the
general and special capacities of students in the industrial education
subjects.

Achievement Tests. Achievement tests measure abilities or products
acquired from the school or other types of educational experience
of the pupil. Such tests may be standardized, or they may be informal
examinations made by the teacher. Considerable attention is given in
this book to the problems arising out of the construction, use, and
evaluation of achievement tests.

Diagnostic Tests. Diagnosis is really one of the major underlying
purposes of all achievement testing. In fact, it may be said that all
general achievement tests are diagnostic to a degree. Most achievement
tests, however, fail to furnish adequate diagnostic information
because of the large number of skills they cover and because of the
difficulty of securing a sufficiently detailed interpretation of the
results. Diagnostic tests are specially constructed achievement tests
designed to discover the exact identity and location of the pupils’
strengths and weaknesses in subject-matter mastery. The development and
use of such tests mean, of course, that the subject-matter itself has
been analyzed to the point that the basic or underlying skills are
clearly identified. It is fairly safe to assume that subject-matter
fields in which detailed diagnostic tests are not available have not
yet been subjected to this type of analysis.

Tests of this diagnostic or analytical character, were they
available, would be most useful to industrial education teachers in
discovering what is already known by the pupil and thus indirectly in
finding what remains to be mastered. This is really an inventory use of
the tests. Genuine diagnostic tests have been slow to appear in
industrial education subjects. The Newkirk-Stoddard Home Mechanics
Test¹ (see page 115 for extracts from this test), though not strictly
diagnostic, furnishes a useful analysis of instruction in home
mechanics. It may be used also to determine how well a school is
teaching the outstanding home mechanics jobs or what jobs the individual
pupils are best acquainted with. Hunter’s Shop Tests² are further
illustrations of tests with some diagnostic value. For example, in
this series of measures on woodwork there are tests on tools,
fastenings, trade names, reading rules, wood finishing, and others.

1 Newkirk, L. V., and Stoddard, George D., Newkirk-Stoddard Home
Mechanics Test, Bureau of Educational Research and Service, State
University of Iowa, Iowa City, Iowa, 1928.

2 Hunter, Wm. H., Shop Tests, The Manual Arts Press, Peoria, Illinois.

Prognostic Tests. One of the very significant features of modern
measurement in education is its emphasis on prediction. Tests of
general mental ability are useful to the extent that they predict a
pupil’s general level of accomplishment. Prognostic tests are measures
of specialized aspects of intelligence. The purpose of such tests is to
provide the basis for accurate prediction of future achievement in
specialized fields on the basis of present performance on some
fundamental underlying elements of the subject. Prognostic tests are
designed to measure specific abilities underlying achievement in a
particular subject-matter field rather than the achievement itself.

Aptitude or prognostic tests in industrial education should be most 
useful in determining the probability of success of a student in such
subjects as drawing, machine shop, carpentry, bricklaying, cabinet 
making, or in any other special field. Tests of mechanical ability have 
value in predicting probable future success in industrial education 
subjects. 

Intelligence Tests. There are many definitions of intelligence, and 
many different ways and means of measuring it. In general, intelli¬ 
gence is the capacity of the individual to adapt himself to novel situa¬ 
tions. It is the power of the individual to learn. In actual practice, 
intelligence is usually measured in terms of the extent to which the
individual has applied this power in the acquisition of information and 
skills in a number of specific and mainly unrelated fields. In a sense, 
general mental ability is like a cable composed of many strands and 
fibers of varying size and quality, each representing some particular 
phase of ability. The intelligence test is merely a device for taking 
a cross-section of this cable. If the measuring device reveals the large 
and important strands of the cable it is a valid instrument.

SUMMARY 

The two general classifications of educational tests in common 
usage are objective and essay-type tests. The objective test may be 
scored without the subjective judgment of the teacher. The grading 
of the essay type of examination presents a real problem and is influ¬ 
enced by the subjective judgment of the teacher. A test exercise 
which is perfectly objective may be scored repeatedly at widely separated
intervals without significant variation in results.

Measuring instruments are roughly divided into tests and scales.
A test is a measuring instrument used for the evaluation of any knowl¬ 
edge, quality, or ability. A scale is a measuring instrument that ranks 
accomplishment directly in terms of systematic levels, grades, or ages. 
Objective measuring instruments are further designated as standard¬ 
ized and teacher-made tests and scales. A test is standardized when 
it is composed of exercises that have been selected in the light of usual
teaching practice, evaluated as to innate difficulty, and is accompanied
by norms or standards permitting the interpretation of results in
levels of accomplishment.

Achievement tests measure abilities or products acquired from the 
school or other types of educational experience of the pupil. Diag¬ 
nostic tests are specially constructed achievement tests designed to 
discover the exact identity and location of the pupil’s strengths and
weaknesses in subject-matter mastery. Prognostic tests may be 
thought of as measures of specialized aspects of intelligence. Intelli¬ 
gence or general mental ability may be described as the power the 
individual has to adapt himself to novel situations. Intelligence tests 
are classified as group and individual, depending on the method of
administration they employ. 

SUMMARY EXERCISES FOR DISCUSSION 

1. What special features distinguish the objective test from the essay-type test?

2. Enumerate as many as possible of the special factors which distinguish
standardized tests from informal objective tests.

3. What does the process of standardization of a test imply?

4. Illustrate the different types of educational tests, using materials from the
industrial arts field.

5. In what specific ways are prognostic tests different from tests of general
mental ability?

6. What distinguishes a test from a scale?

7. What qualities distinguish a rate test from a power test?

8. Suggest specific ways in which quality scales may be particularly useful in
industrial arts classes.

9. What types of measuring instruments seem to have the greatest possibilities
of practical value in industrial education?

10. Do you think it will ever be possible to develop genuinely diagnostic tests
in this field of instruction? Why?

CHAPTER III


USES OF TESTS IN CLASSROOM AND SHOP

9. Tests as Related to Instruction. 

The uses of educational tests for administrative, supervisory, research,
and survey purposes, important as they are, do not represent
their most vital and important functions. In the past so much emphasis
has been given to these particular uses that the teacher often
lost sight of their real utility in the solution of his individual instructional
problems. The recent development of reliable, valid, and highly
detailed measuring instruments designed to parallel closely the subject-matter
content taught by the teacher has caused him to shift his
point of view. He realizes now that tests are most important supplements
to other instructional material, and that without them he can
scarcely hope to work at his highest level of instructional efficiency.
He notes that modern tests are usually well made, detailed, comprehensive,
and analytical. He sees that with this type of instrument
available it is possible for him to test as he teaches; to chart his
instructional course from accurate and objective observations. Modern
tests give the teacher a chance to discover where emphasis should
be placed, and to determine when a satisfactory level of control has
been attained.

The busy classroom teacher can hardly be expected to construct 
tests which will possess all the merits of a carefully constructed stand¬ 
ard test. This assumes a breadth of knowledge of the subject-matter 
and a training in the technique of test construction which most teachers
do not have. Even if there were perfect subject-matter mastery
and a thorough knowledge of the making of tests, it is doubtful if the 
typical classroom teacher should be expected to spend his time in this 
way when, in most fields, other, better-made, and more economical 
materials are available. Yet, the teacher should certainly not be 
forced to depend upon his general observation of his pupils for his 
information concerning their strengths and weaknesses. 

There are times and conditions, however, in which the informal
objective or teacher-made test is very useful. The teacher-made
objective-type shop test serves its most useful function in measuring the
relative achievement of the individual members of a class. Quite often
standardized tests contain items that are not taught in the course.
Frequently they do not contain items which are taught. Teacher-made
tests can be designed to fit the specific needs of the shop teacher’s
own course. Teacher-made achievement tests can be used for
diagnosis of special difficulties, for the motivation of learning, for
the measurement of accomplishment and assigning shop marks; but
since they do not have norms, the results obtained cannot be compared
with data from other schools. However, the industrial education
teacher can study his success as a teacher by comparing results
from semester to semester and from year to year.

10. Specific Uses of Tests. 

The specific applications of tests to industrial education are dis¬ 
cussed under the following general topics: 

1. The measurement of class and pupil achievement.

2. The establishment of standards and norms of performance.

3. The motivation of learning.

4. The determination of efficiency of instruction.

5. The placement and guidance of pupils.

6. The evaluation of teaching materials and methods.

This broad scope of usefulness indicates that measurement is a funda¬ 
mental factor in teaching industrial education subjects and that it is 
largely through the application of measurement to these subjects that 
adequate teaching methods and materials will be developed. 

11. Measurement of Class and Pupil Achievement. 

The very principles lying back of the construction of educational 
tests almost guarantee their usefulness to the classroom teacher in 
evaluating individual pupil and class accomplishment. The selection 
of the test items to cover the basic portions of the course of study 
which is or should be taught provides one basis for comparison. The 
form in which the test exercises are stated eliminates the personal 
equation of the individual teacher. The length of most of our approved
standard tests guarantees consistency in the results of their
use. The existence of norms and standards gives definite meaning 
to their scores. It is, therefore, a relatively simple matter for the 
classroom teacher to secure an accurate measure of the accomplish¬ 
ment of his class. 

The use of a standard test in almost any selected subject makes
possible the direct comparison of the individual pupils in the class on
an objective basis. The simple procedure of determining the average
of the scores made by the class permits a direct comparison of this 
particular class with other, comparable classes in the building or 
system. Another very useful type of comparison is one which is fre¬ 
quently made between achievement at the beginning and at the end of 
a period of instruction. Each particular type of comparison serves 
its own purpose of assisting the teacher in determining the relative at¬ 
tainment and progress of his class. 

12. Establishment of Goals of Attainment. 

The fact that a test has been put through the process known as 
standardization gives it a distinct value in the classroom which an 
informal test does not have. The establishment of standards or norms 
for a test sets up in an objective way the goals to be attained in the 
course. The determination of whether or not a wood joint, a solder 
joint, a rope splice, a wire connection, a hem, a type of stitch, a drawing,
or a sample of freehand lettering is acceptable is not necessarily a
matter of individual teacher judgment. In many of these fields ob¬ 
jective standards on tests and scales establish these levels of attain¬ 
ment. 

The comparison of results obtained from shop or laboratory proj¬ 
ects with norms and standards for tests and scales gives the teacher an 
accurate indication of the achievement of his class in relation to other 
classes at the same experience or grade level. Experience shows that 
it is very helpful for an industrial education teacher to be able to 
evaluate his teaching success in terms of other teachers’ accomplish¬ 
ments. If the results are consistently lower and the test items cover 
the course of study in a suitable manner it indicates the need for im¬ 
proved methods on the part of the teacher, or else reveals very low 
aptitude on the part of the pupils. 

An example of test norms in the industrial education field is given
in Table 8, which shows data on norms for the Nash-Van Duzee Woodwork
Test. The norms (medians) show the average achievement
scores of pupils on this test according to the number of minutes of
instruction they have had. Any teacher capable of giving this test can
compare the median of his class scores with the general average over
the United States. In a large city the average accomplishment of different
classes in the same school system may be compared in a similar
way. Table 8 gives norms from the Nash-Van Duzee Woodwork
Test I, Scale B.¹ This table shows the norms on the basis of semesters
of work and amount of instruction in minutes. Thus the teacher will
be able to compare the median achievement of his class with the
achievement of other similar classes that have had the same amount
of instructional time or that have been in the course the same number
of semesters.

1 Nash, Harry B., and Van Duzee, Roy R., Woodwork Tests, The Bruce
Publishing Company, Milwaukee, Wisconsin.


TABLE 8

Showing Median Score Norms, Based on a Non-time and a Time Situation

Junior High School

                        First        Second       Third        Fourth       Fifth        Sixth        Seventh
                        Semester     Semester     Semester     Semester     Semester     Semester     Semester
End of semester work
  (minutes)             1400         2400         3400         [illegible]  [illegible]  [illegible]  17,000
Non-time median score   36           45           65           60           66           75           83
Time median score       44           47           63           68           64           71           80

Senior High School

                        Eighth Semester   Ninth Semester and up   Possible Score
End of semester work
  (minutes)             25,000            32,000
Non-time median score   90                105                     184
Time median score       86                101                     199
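
The comparison of a class median with a published norm can be sketched in a few lines of Python. This is only an illustration: the norms are the non-time medians as printed in Table 8, the numbering of the junior high school columns from 1 to 7 is an assumption, and the class scores and the function name compare_with_norm are hypothetical.

```python
from statistics import median

# Non-time median score norms from Table 8 (junior high school).
# Numbering the columns 1 through 7 is an assumption of this sketch.
NON_TIME_NORMS = {1: 36, 2: 45, 3: 65, 4: 60, 5: 66, 6: 75, 7: 83}


def compare_with_norm(class_scores, semester, norms=NON_TIME_NORMS):
    """Return (class median, norm, class median minus norm)."""
    class_median = median(class_scores)
    norm = norms[semester]
    return class_median, norm, class_median - norm


# Hypothetical scores for one second-semester class (not from the book).
scores = [38, 52, 41, 47, 44, 50, 39, 46, 43]
class_median, norm, difference = compare_with_norm(scores, semester=2)
print(f"class median {class_median}, norm {norm}, difference {difference:+}")
```

A negative difference means the class median falls below the norm for classes with the same amount of instruction; as the text cautions, that may reflect either the teaching or the aptitude of the pupils.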


13. Motivation of Student Learning. 

Tests and examinations have long been recognized by teachers as 
useful motivation devices. Many teachers have not realized, however, 
that the extent of this utility depends upon the character of the tests 
themselves. If the test is so constructed that it permits superficial 
thinking and shallow answers it stimulates precisely that type of work. 
If it calls for critical thinking, exact results, concise statements, care¬ 
ful evaluation of facts, then the force of the motivation is in the right 
direction. The use of even a moderately good test or examination may 
accomplish much in the way of stimulating proper habits of work on 
the part of the pupils. Sometimes even the mere administration of 
the test, or the knowledge on the part of the pupils that it is to be 
given, has a desirable effect. The greatest good, however, comes from 
the use of a carefully standardized test or scale, followed by the exact
location of individual pupil weaknesses and the application of corrective
measures immediately after their discovery. The best experimental
evidence shows that significant gains in pupil accomplishment
accompany the sane use of properly constructed tests in such a way
that the pupil himself is aware of his accomplishments and limitations.

14. Determination of Efficiency of Instruction. 

These comparisons are interesting and often valuable as general
guides, but if pupils are making low scores it is much more important
to know where the scores are low. Is it in lettering, lack of textbook
knowledge, poor technique, wrong type of instruction sheets, dull tools,
or lack of interest? Just what are the conditions which cause the
class to be lower than it should be on a standardized test? By a careful
analysis of results it is often possible to determine weak points in
the achievement of the class. A chart with the numbers of the test
items in the standardized test on the left-hand side and the number of
pupils getting the item correct on the right side is very useful for this
purpose. Table 9 gives an example of this type of instructional analysis
from a class of twenty eighth-grade boys as tested by Form B
of the Newkirk-Stoddard Home Mechanics Test.² The test was given
at the end of one semester of instruction. An examination of Table 9
shows that items 1, 6, 8, 12, 16, 20, 23, 24, 26, 27, and 34 are low in
the numbers of pupils responding correctly. This analysis indicates
that the jobs which correspond to these numbers probably were not
taught effectively or at least were not properly mastered by the class.
As a matter of fact, in this particular case, the main reasons for
the ineffective teaching were lack of supplies for teaching the jobs
properly, poor demonstrations, no supplementary references, and a
lack of instructional time on the part of the teacher due to an unduly
heavy teaching load in other branches.
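
The item-by-item chart described above lends itself to a short tabulation. The sketch below uses the counts printed in Table 9 for the class of twenty; the rule of flagging any item answered correctly by fewer than half the class is an assumption made for illustration, since the text does not state a cutoff, and the name weak_items is hypothetical. With the counts as printed, item 17 (three correct) is also flagged, although the text does not list it.

```python
# Number of pupils answering each item correctly, items 1 to 36 (Table 9).
CORRECT = [6, 18, 20, 18, 15, 3, 17, 2, 20, 18, 15, 6, 15, 14, 18, 5, 3, 20,
           18, 2, 18, 17, 1, 3, 18, 4, 1, 14, 18, 17, 15, 14, 16, 2, 13, 11]
CLASS_SIZE = 20


def weak_items(correct, class_size, cutoff=0.5):
    """Return item numbers answered correctly by less than `cutoff` of the class.

    The 50 per cent cutoff is an assumption of this sketch.
    """
    return [item for item, n in enumerate(correct, start=1)
            if n / class_size < cutoff]


for item in weak_items(CORRECT, CLASS_SIZE):
    n = CORRECT[item - 1]
    print(f"item {item:2d}: {n:2d} of {CLASS_SIZE} correct")
```

The teacher would then look up the job each flagged item covers and ask, as the text does, whether supplies, demonstrations, references, or instructional time were lacking.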

15. Class Diagnosis. 

Standardized tests of achievement are of value to industrial education
teachers in determining the difficulties and abilities of the various
members of the class. It is generally known that the background and
abilities of the individual members of any class may vary widely. If
the shop teacher is able to secure an accurate picture of the information
and skills that the pupils already have when they enter the class,
it will be of great value in placing the emphasis so the greatest
instructional efficiency will result from the time allotted. This type of
information is especially useful to industrial education teachers who
are teaching advanced classes, but it is valuable in all classes on the
secondary level.

2 Newkirk, L. V., and Stoddard, George D., Newkirk-Stoddard Home Mechanics
Test, Bureau of Educational Research and Service, Iowa City, Iowa, 1928.

TABLE 9

Analysis of Class Instructional Weakness in Home Mechanics
Newkirk-Stoddard Home Mechanics Test, Form B

Test    Number of      Test    Number of      Test    Number of
Item    Correct        Item    Correct        Item    Correct
        Responses              Responses              Responses
  1         6           13        15           25        18
  2        18           14        14           26         4
  3        20           15        18           27         1
  4        18           16         5           28        14
  5        15           17         3           29        18
  6         3           18        20           30        17
  7        17           19        18           31        15
  8         2           20         2           32        14
  9        20           21        18           33        16
 10        18           22        17           34         2
 11        15           23         1           35        13
 12         6           24         3           36        11

Even in small classes it is obvious that there is wide variability in
the number of jobs that the pupils already know how to do and also
wide variability in the specific jobs. Table 10 shows this very clearly
by data obtained by giving the Newkirk-Stoddard Home Mechanics 
Test to a class of nine eighth-grade boys at the University of Iowa 
High School to determine which items in home mechanics they already 
knew and to see where to put the instructional emphasis for each 
pupil. 

TABLE 10

Number of Items in Newkirk-Stoddard Home Mechanics Test Each Pupil
Answered Correctly³

Pupil   Form A                                Form B
L       12, 15                                1
T       1, 2, 4, 5, 14                        2, 5, 14, 17, 32, 36
P       1, 2, 4, 5, 9, 10, 15, 30             1, 2, 4, 5, 10, 12, 13, 14, 16, 29, 34
S       1, [illegible], 4, 5, 22              1, 4, 15, 16, 28
D       2, 13, 33, 34, 35, 36                 2, 4, 5, 16, 33, 34, 35
H       1, 2, 4, 5, 9, 12, 13, 23             1, 2, 6, 8, 10, 12, 14, 15, 33, 34
Z       1, 2, 4, 15, 36                       4, 6, 12, 34
W       1, 2, 4, 5, 6, 8, 18, 30, 32, 34      2, 4, 13, 14, 16, 17, 26, 28, 29, 33
M       2, 6, 7, 8, 9, 12, 17, 26, 29         2, 3, 4, 6, 8, 9, 10, 11, 13, 16, 18,
                                              19, 21, 23, 28, 29, 33, 34, 36


Out of the seventy-two items in the two forms of the test, only 
twenty-three were not answered correctly by some of the pupils. The 
highest score that any one received was twenty-eight jobs right 
(pupil M). The results are very valuable from the standpoint of instructional
efficiency, because this pupil will not have to spend time
repeating material with which he is familiar. In the case of pupil L, 
who scored only three right, the test has identified an individual who 
needs careful attention. To the pupil himself it is a clear indication 
of the need for more instruction. 

3 Newkirk, L. V., Validating and Testing Home Mechanics, University of
Iowa Studies in Education, University of Iowa, Iowa City, Iowa, Series 201, 1931,
pp. 30-31.

Information of this type is valuable not only for increasing instructional
efficiency, but also for motivating the pupils on their proper
level of accomplishment. Unless the teacher has previous knowledge
that pupil M knows how to do twenty-eight of the jobs specified he is
quite likely to waste his own and the pupil’s time through useless repetition.
This usually results in building up bad habits of work on the
part of the pupil. Furthermore, the teacher might assume that pupil
M did not know how to do any of the tasks in the test when as a
matter of fact he knows much of what he has to learn. For example,
pupil L who knows three jobs might learn twenty more and his score
would then be twenty-three, and pupil M who knows twenty-eight
might learn ten more and have a score of thirty-eight. The boy who
had learned twenty would have accomplished more, but the final score
would not indicate that he had accomplished twice as much. In fact,
it would give the impression that he had accomplished fifteen less.
This merely illustrates the need for giving industrial education tests
at the beginning of a course to discover what is already known, during
the semester for indications of progress and for motivation, and at the
close of the semester to measure accomplishment and growth.
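
The arithmetic of the example can be laid out explicitly. The short Python sketch below uses only the figures given in the text for pupils L and M; the function name gain is hypothetical.

```python
def gain(initial_score, final_score):
    """Jobs learned during the course: final score minus initial score."""
    return final_score - initial_score


# (initial score, final score) for the two pupils in the example above.
pupils = {"L": (3, 23), "M": (28, 38)}

for name, (initial, final) in pupils.items():
    print(f"pupil {name}: final score {final}, gain {gain(initial, final)}")

# Judged by final score alone, L appears fifteen jobs behind M;
# judged by gain, L has learned twice as many jobs as M.
final_gap = pupils["M"][1] - pupils["L"][1]
gain_ratio = gain(*pupils["L"]) / gain(*pupils["M"])
print(f"final-score gap {final_gap}, ratio of gains {gain_ratio}")
```

The gain can be computed only when a test has been given at the beginning of the course, which is the point of the paragraph.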

16. Individual Pupil Diagnosis. 

Closely related to the measurement of class and individual levels 
of accomplishment is the diagnosis of individual learning difficulties of 
certain pupils in the class. Just as in the other instructional fields, the 
teacher may assume that these pupils are naturally slow or do not try. 
It frequently occurs, however, that upon closer examination these 
slower pupils have many learning difficulties which can be corrected by 
the application of proper remedial teaching. The possible causes of 
these difficulties are numerous; they may be one or more of the fol¬ 
lowing: malnutrition, defective eyesight, difficulty in hearing, poor 
reading ability, poor technique in manipulation of some or all tools, 
inability to adjust tools, inability to read a working drawing, ignor¬ 
ance of sizes of tools, unfamiliarity with related mathematics, low 
mechanical ability, emotional maladjustment, social maladjustment, 
and low intelligence. These difficulties are usually obvious in extreme 
cases, but the majority of the pupils in the class may have one or more 
of the difficulties which will seriously affect their ability to profit from
the instruction. It is on this account that the industrial education 
teacher needs as much professional information about his pupils as it 
is possible to secure in order better to adapt his instruction to the indi¬ 
vidual differences and abilities of his pupils. 

The efficient shop teacher must know how to test many factors 
other than those that relate directly to achievement in industrial edu¬ 
cation. Fundamentally he is a teacher of individuals and not a 
teacher of drawing, woodwork, metal work, electricity, printing, or 
auto mechanics. Table 11 illustrates types of information which industrial
education teachers will find useful in their teaching and
guidance activities.

TABLE 11

Desirable Types of Professional Information

Name: John J.          Grade: 10

Test                   Score
Intelligence test      112
Reading                40
Language               [illegible]
Spelling               55
Writing (quality)      60
Mathematics            75
Mechanical aptitude    120

Hypothetical grade norms for the test scores in Table 11 are given as follows:

        Intelligence                                                      Mechanical
Grade   Test           Reading   Language   Spelling   Writing   Mathematics   Ability
  7      90            40        20         40         60        30             70
  8      96            45        32         45         65        40             80
  9     100            68        45         50         72        50            100
 10     110            60        60         71         80        75            130
 11     122            65        60         75         80        80            135
 12     136            75        65         82         99        84            150

The information in Table 11 gives the achievements obtained by
a tenth-grade pupil on a number of tests, and the hypothetical grade
norms indicate what the pupil’s level of achievement should be. The
levels of accomplishment are indicated graphically in Fig. 2.

The profile chart shows that John J. has intelligence slightly better
than a tenth-grade pupil, the reading and language ability of a sev¬ 
enth-grade pupil, spelling ability a little above that of a ninth-grade 
pupil, writing quality equal to that of a seventh-grade pupil, mathe¬ 
matical and mechanical ability equal to that of a tenth-grade pupil. 
Assume further that this pupil is in the tenth grade in electrical shop 
and is making slow progress. The teacher in the electric shop is 
using instruction sheets and related reference materials as supple¬ 
mentary teaching devices. The students are required to do consid¬ 
erable reading and to write out the answers to the questions on the 
individual instruction sheets. By studying the chart in Fig. 2 it is 
obvious why this pupil has difficulty in making satisfactory progress. 
He cannot read well and is a poor writer, although he has intelligence 
and mathematical and mechanical ability adequate for doing good 
work in the course. The remedy here is special instruction in reading 
and language with additional emphasis on writing legibly. 
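
The reading of the profile can be reproduced from the grade norms. The sketch below is an illustration under stated assumptions: the rule "highest grade whose norm the score reaches" is one simple way to assign a grade level and is not given in the text; the reading score of 40 and the tenth-grade reading norm of 60 are taken from damaged entries in the printed tables; language and mechanical ability are left out; and the names GRADE_NORMS and grade_level are hypothetical.

```python
# Hypothetical grade norms for five of the tests in Table 11.
GRADE_NORMS = {
    "intelligence": {7: 90, 8: 96, 9: 100, 10: 110, 11: 122, 12: 136},
    "reading":      {7: 40, 8: 45, 9: 68, 10: 60, 11: 65, 12: 75},
    "spelling":     {7: 40, 8: 45, 9: 50, 10: 71, 11: 75, 12: 82},
    "writing":      {7: 60, 8: 65, 9: 72, 10: 80, 11: 80, 12: 99},
    "mathematics":  {7: 30, 8: 40, 9: 50, 10: 75, 11: 80, 12: 84},
}

# John J.'s scores; the reading score of 40 is an assumed reading of a
# damaged table entry, consistent with the seventh-grade level in the text.
JOHN = {"intelligence": 112, "reading": 40, "spelling": 55,
        "writing": 60, "mathematics": 75}


def grade_level(score, norms):
    """Highest grade whose norm the score reaches, or None if below all."""
    reached = [grade for grade, norm in norms.items() if score >= norm]
    return max(reached) if reached else None


profile = {test: grade_level(score, GRADE_NORMS[test])
           for test, score in JOHN.items()}
enrolled_grade = 10
below_grade = [test for test, level in profile.items()
               if level is not None and level < enrolled_grade]
print(profile)
print("below enrolled grade:", below_grade)
```

Under these assumptions the profile places intelligence and mathematics at the tenth grade, spelling at the ninth, and reading and writing at the seventh, which is the pattern the text draws from Fig. 2.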

All the illustrations used here have been on the basis of grade 
norms because they are easy to compute and illustrate the different 
levels of accomplishment. However, many industrial education teach¬ 
ers may wish to classify pupils on the basis of ability to learn, and 
reveal progress by using mental ages and achievement ages of the 
pupils in their classes. The various types of norms are discussed in 
Chapter XV. 

17. Gradation and Guidance. 

Tests of intelligence and mechanical and special aptitudes are of 
value to supervisors and teachers of industrial education in classifying 
pupils with approximately equal learning power. In a large school 
system where there are several sections of a class, it is usually con¬ 
sidered desirable for instructional purposes to classify pupils into 
groups of about equal learning abilities. It is easier to meet the indi¬ 
vidual learning difficulties of a group of pupils if they have about the 
same general intelligence. In the small school it may not be possible 
to divide the pupils into instructional groups of approximately equal 
learning ability, but usually the classes are small and the teacher has 
more time for individual instruction. The industrial education teacher 
may divide classes on the basis of either mechanical ability or intel¬ 
ligence. Both these factors are important in shop instruction, but they 
do not correlate highly. Some pupils have low mechanical ability and 
high intelligence, others high mechanical ability and low intelligence, 
as indicated by intelligence tests. Scores on intelligence tests and 
scores on mechanical-ability tests usually have a positive correlation
of between .20 and .30. Of course, the large majority of pupils have
about average mechanical ability and average intelligence. If an in¬ 
dustrial education class is selected on the basis of scores on intelligence 
tests the result will be a class with similar intelligence ratings but with 
variable mechanical aptitude. If selected on the basis of mechanical 
aptitude the intelligence ratings will be variable. 
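
A correlation of .20 to .30 is low: squaring it shows that the two sets of scores share only about 4 to 9 per cent of their variance, which is why sectioning on one measure leaves the other widely spread. A teacher with both sets of scores for a class can compute the coefficient directly. The sketch below implements the ordinary Pearson product-moment formula; the scores are hypothetical, invented only to show the call, and the function name pearson_r is likewise an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical scores for eight pupils (not from the book).
intelligence = [95, 102, 110, 88, 120, 105, 99, 113]
mechanical = [130, 90, 105, 110, 100, 140, 85, 120]

r = pearson_r(intelligence, mechanical)
print(f"r = {r:.2f}, shared variance = {r * r:.0%}")

# Share of variance implied by the range quoted in the text.
for quoted in (0.20, 0.30):
    print(f"r = {quoted:.2f} -> shared variance {quoted ** 2:.0%}")
```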

In the first two years of the junior high school where information 
about tools, materials, and industries is considered more important 
than acquiring outstanding tool skill, it seems desirable to section 
classes on the basis of intelligence scores, because of the nature of the 
learning problems. In advanced courses where trade training and the 
acquiring of trade skill are the dominant objectives, it probably is 
better to classify pupils on the basis of mechanical aptitude, since that 
is of vital importance in acquiring outstanding skill in manipulating 
tools and materials. In either case, it would be desirable to have 
both ratings for use in adapting instruction to the individual diffi¬ 
culties of the pupils. 

Tests of mechanical aptitude or mental ability are usually admin¬ 
istered by the supervisory officers in the school or by persons espe¬ 
cially trained in administering tests. If this practice is followed, the 
industrial education teacher can frequently get the necessary information
from the central office of the school. However, the needed information
is not always available, and oftentimes it is necessary to check
uncertain scores or to test pupils who have recently entered school or
for whom test scores are not available. Industrial education teachers 
also need information about giving and scoring tests of special abili¬ 
ties so that the scores and the implications for their use will be clear. 

The chief danger in using test scores for gradation or guidance pur¬ 
poses is that they may not be interpreted in the light of their true 
meaning. The scores from the best educational measures are not so 
reliable for individual diagnosis as they are for indicating general 
trends or levels of accomplishment. It has been found that test scores 
which are very high or very low are most likely to be in error. The 
combined scores of several similar tests can be used with more cer¬ 
tainty in diagnosis of pupil difficulties than any one score. Before 
very high or very low scores are used they should be rechecked by 
giving similar tests or other forms of the same test. Scores obtained 
on carefully prepared educational tests are more accurate than the 
teacher’s subjective judgment, but they are not accurate enough to be 
considered final and used dogmatically. 

Teachers of industrial education should be very careful not to confuse
the purposes of aptitude and special-ability tests with achievement
tests. A pupil who has a high score on tests of intelligence and
mechanical ability, other factors being equal, should do good work in
industrial education courses. The fact that a pupil has these abilities
does not necessarily mean that he should receive a high mark. Regardless
of a pupil’s abilities, he should be marked on the basis of
actual achievement in the course taken. Standardized and teacher-made
tests should be utilized for measuring achievement and the results
used as a major factor in assigning shop marks. Tests of special
abilities are valuable in guidance, in classification of pupils, and
in pointing out individual pupil difficulties. They are not of particular
value in measuring the amount of information or skill acquired in
industrial education courses.

18. Tests in Research. 

One of the obligations of a teacher to his profession is to discover 
new truths which can be applied for the improvement of work in his 
chosen field of endeavor. Carefully constructed educational tests can 
be used to discover new and better ways of organizing and teaching 
industrial education. It does not seem likely that a scientific method 
of instruction can be developed in any instructional field without 
suitable measures of achievement and abilities. The following are 
examples of a few of the problems which could be solved in part 
through the use of adequate tests. 

1. What are the relative values of different teaching methods for industrial
education subjects (use of demonstrations, instruction sheets, class
instruction, individual instruction)?

2. What type of shop organization is most effective (composite, unit)?

3. How much instructional time should be given to lecture, demonstration,
and individual instruction?

4. What types of individual instruction sheets are most effective at different
grade levels?

5. What is the proper size of a class in drawing, sheet metal, machine shop,
foundry, woodwork, auto mechanics, printing, and the general shop?

6. What is the most economical length of period to be used in industrial
education instruction?

7. What is the most effective classification of instructional materials in
industrial education courses on the basis of grade accomplishment?

SUMMARY 

Educational measurements have the following general uses in industrial
education: to measure class and pupil achievement, to establish
standards of performance, to motivate learning, to diagnose pupil
learning difficulties, to mark and promote pupils, to classify pupils
according to abilities, and to study the effectiveness of teaching methods.

Educational measurement is a fundamental factor in teaching industrial
education. Standardized and teacher-made tests are valuable in
measuring achievement. Aptitude tests are valuable in guidance and
diagnosing individual difficulties. Teachers need a great deal of professional
information about their pupils other than measures of
achievement if their courses of instruction are to be effectively adapted
to individual needs of their pupils.

SUMMARY EXERCISES FOR DISCUSSION 

1. List the major factors which would make it difficult for the classroom teacher
   to construct tests which will have the merits of carefully constructed
   standardized tests.
2. Enumerate and illustrate the six main uses of tests in industrial education.
3. Show how tests of intelligence and special aptitudes may be used for
   gradation and guidance purposes.
4. What is the teacher’s responsibility for the use and interpretation of standard
   tests and scales in the classroom and shop?

CHAPTER IV


SELECTION AND EVALUATION OF TESTS 

I. CRITERIA FOR INDUSTRIAL EDUCATION TESTS 

Several characteristics of a good test should be considered by the 
shop teacher in evaluating published tests or tests of his own construc¬ 
tion. The most important of these are validity, reliability, objectivity, 
adequate norms, the existence of duplicate and equivalent forms, ease 
of administration, and economy. An understanding of these factors
will do much to insure the selection or construction of a test suitable 
for the testing problem at hand. 

19. Validity. 

The general concept of validity in a test may be made clear by 
thinking of the conditions set up by the test as a small sampling 
of a larger life situation. At the outset it is assumed that the field 
which the test samples is of some real importance. If this is the case, 
then the more nearly the conditions set up in the test itself duplicate 
the larger situation as found in life the more valid it becomes. For 
example, it would doubtless be possible to prepare a laboratory test 
designed to measure one's ability to handle an automobile in heavy 
city traffic and one’s reactions to the situations encountered there. 
It would be much more practical (valid) to bring the subject into 
direct contact with a bit of heavy traffic and determine exactly how 
he does react to it. 

From the point of view of the classroom teacher, validity usually 
is concerned with the question of whether the materials tested are 
actually of real significance, and whether the pupil has had any ade¬ 
quate opportunity to master the facts tested as a result of his contact 
with the course of study taught. Validity may be defined as some 
type of objective expression of the degree to which the particular 
measuring instrument measures what it is supposed to measure. That 
is to say, a test which is designed to measure ability to read blueprints 
after a short period of training, and later is found to be a better test 
of general intelligence, would be considered to be lacking in validity 
for the purpose for which it was designed. Validity is usually
expressed in terms of the correspondence of results obtained from the
particular measuring device under consideration and other, similar
instruments of previously determined validity. Very often it is im¬ 
possible to secure measures from other instruments of known validity. 
In these cases it is a common practice to refer to estimates or judg¬ 
ments of individuals who have had an opportunity to evaluate in a 
rather definite way the abilities of the individuals involved in the 
validation study. Frequently validity is determined by the extent 
to which a test calls into play the skills and abilities which experi¬ 
enced observers consider fundamental to success in the given field. 
The validity of many of the items in the Newkirk-Stoddard Home
Mechanics Test is dependent to a large degree upon the agreement 
of certain teachers, supervisors, and other qualified authorities that 
the processes called for are the significant ones. 
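
The correspondence between a new test and a criterion of known validity is commonly summarized by a correlation coefficient. The following is a minimal sketch under that assumption; the function name `pearson_r` and both score lists are hypothetical illustrations and are not drawn from any validation study mentioned in this chapter.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    covariation = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return covariation / sqrt(var_x * var_y)


# Hypothetical scores for six pupils: the test under study, and a criterion
# measure (another instrument, or pooled ratings by qualified judges).
new_test = [12, 15, 9, 20, 17, 11]
criterion = [30, 34, 25, 41, 38, 27]

validity_coefficient = pearson_r(new_test, criterion)
print(round(validity_coefficient, 2))
```

A coefficient near 1 would suggest that the new test ranks pupils much as the criterion does; it says nothing about whether the criterion itself was well chosen.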

The validation of the content of this test was achieved in part 
by the pooled judgment of experienced teachers, home owners, and 
tradesmen. The home owners indicated the projects and content 
which they believed to be important in the maintenance and operation 
of the home. The teachers of home mechanics in 75 schools marked 
the jobs which they considered most important. In developing the 
procedure type of question used in the test, it was necessary to have 
the procedures checked against good trade practice by tradesmen. 
Table 12 gives the ten most frequently occurring home mechanics
projects according to the judgment of 100 home owners living in
small towns in the Middle West. Table 13 gives the ten highest-ranking
jobs in home mechanics according to the judgment of 75 teachers.
Table 14 shows a procedure rearrangement question
taken from the Newkirk-Stoddard Home Mechanics Test, the numbers
in the parentheses indicating the best trade procedure according to
the five tradesmen who judged it.

TABLE 12

Ten High-ranking Home Mechanics Jobs According to 100 Home Owners

     Job                                              Frequency
 1. To sharpen knives                                     98
 2. To install a pair of hinges                           58
 3. To put new screen on a window or door                 65
 4. To connect batteries                                  95
 5. To shape the point of a screw-driver                  95
 6. To wash a window                                      95
 7. To use glue for general repair                        94
 8. To regulate a watch or clock                          94
 9. To fire a furnace                                     94
10. To locate a blown fuse and replace                    94

TABLE 13

Ten High-ranking Home Mechanics Jobs According to 75 Home Mechanics Teachers

     Job                                                            Frequency
 1. To make suitable splices, taps, and terminals in electric wires     64
 2. To tin a soldering copper                                           63
 3. To wire an electric-light socket                                    63
 4. To mend leaks in kitchen utensils                                   62
 5. To make an extension cord                                           62
 6. To wire simple bell circuits                                        61
 7. To apply stain and filler                                           61
 8. To apply varnish                                                    60
 9. To cut glass to size                                                60
10. To repair leaking compression faucet                                57

TABLE 14

A Rearrangement Question with the Answer as Approved by 5 Tradesmen

To Cut a Piece of Pipe
Procedure: (1) Set the cutter on the mark.
           (2) Ream the end.
           (3) Determine the length of pipe.
           (4) Cut the pipe.
           (5) Measure and mark.
           (6) Adjust the pipe cutter.

           (3) (5) (6) (1) (4) (2)

In many of the achievement tests, validity depends to a large
degree upon the opportunity which the pupil has had to master the
information covered by the test. The validity of a test may be
thought of as being general, or it may be considered as being specific.
Many tests are undoubtedly valid in a general sense but are lacking 
in validity in a specific sense. For instance, a survey test designed 
to secure a general bird’s-eye-view of achievement in a particular 
subject must be validated in terms of its ability to test for the basic 
items found in the courses of study, not of a single class in which 
certain points of view and certain facts have been emphasized, but of 
the many different schools in which it may be used. For his own 
particular class a teacher may easily construct a test which will have 
much greater specific validity for his purposes and his point of view
than any type of commercial standardized test could possibly have.
Recently considerable recognition has been given to this phase of 
validity in tests by providing for the classroom teacher source books 
of objective test exercises in a number of subject matter fields.^ Such 
material permits the classroom teacher to secure a relatively high 
specific validity for his tests and quizzes. 


20. Reliability. 

The reliability of a test may be thought of as the consistency with
which it performs. In a certain sense this matter of consistency of 
performance of a test arises from two factors, the adequacy of the 
sampling represented by the test, and variations in the human re¬ 
sponse itself which have nothing to do with the content of the test. 
The first of these can be controlled somewhat by selecting the test 
items carefully and extensively from the field which it is supposed 
to measure. The principle of sampling may be illustrated by the prac¬ 
tice of the large producers of ore. Obviously it would be impossible 
to examine and test every cubic foot of the ore in every carload. 
It is a simple matter, however, for specimens of the ore to be taken 
from different parts of the car and from different cars. These speci¬ 
mens are carefully mixed together and subjected to the tests which 
determine the quality and price of the ore. This process is called 
sampling. If only one specimen were taken from each car there would 
always be the possibility that the ore at that particular spot might 
have been unusually rich or poor. Taking more and more specimens 
increases the likelihood that the resulting sample will be truly
representative of the ore in the car. In a similar way, increasing the
number of samples taken in a testing situation makes it less likely
that some important phase of the subject has been missed
or given the wrong emphasis, or that some interfering human factor
was operating at the time the samples were taken.

^ J... and Greene, H. A., Pupil-Teacher Handbooks of Objective ... Public School ...

The accompanying diagram illustrates the effect of sampling on
the reliability of a test over a limited field of information. Each
of the small rectangular spaces in Fig. 3 represents an item which a
student has had an opportunity to learn. The thirty shaded portions
represent the items which he has actually learned. The ten unshaded
spaces are items which he has not learned. In this illustration the
student has a mastery of 75 per cent of the items. Now let it be
assumed that a test over this material is prepared comprising ten
items, numbers 1, 2, 3, 6, 7, 8, 10, 12, 16, and 22. If these items are
selected, this individual will fail on six of the ten, making a percentage
score of 40 per cent. However, if ten other items, as numbers 5, 9,
13, 17, 20, 24, 27, 31, 34, and 37, are selected over the entire range of his
field of information he should be able to answer nine of the ten
correctly. From this it is clear that it makes a distinct difference where
the sampling is taken. It may cover material learned but may come
from too limited a portion of the total field to be truly representative
of the individual’s accomplishment. In this illustration, if the
even-numbered items are chosen, the pupil would probably fail on nine
of the twenty items. If the odd-numbered items are chosen he should
fail on only one. This results in a variation in his score from 55 to 95
per cent. As the number of items chosen for inclusion in the test is
increased the pupil’s scores on the test exercises more nearly approach
the actual amount of his information in the field. It thus becomes
apparent that the extent of the sampling is also an important feature
of reliability in a test.
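
The arithmetic of the sampling illustration can be checked with a short sketch. The diagram itself is not reproduced here, so the set of unlearned items below is a hypothetical layout, chosen only so that it agrees with the counts stated in the text (ten unlearned items in forty, six failures in the first ten-item sample, one in the second, nine among the even-numbered items, one among the odd-numbered). The names `UNLEARNED` and `percent_score` are likewise illustrative.

```python
# Hypothetical layout of the 40-item field. The actual shaded and unshaded
# cells of the figure are unknown; this set is an assumption that merely
# matches the counts given in the text.
ALL_ITEMS = range(1, 41)
UNLEARNED = {2, 3, 4, 6, 8, 10, 12, 14, 18, 20}


def percent_score(sample):
    """Percentage of the sampled items that the pupil answers correctly."""
    sample = list(sample)
    correct = sum(1 for item in sample if item not in UNLEARNED)
    return 100 * correct / len(sample)


true_mastery = percent_score(ALL_ITEMS)  # the whole field of 40 items
narrow_sample = percent_score([1, 2, 3, 6, 7, 8, 10, 12, 16, 22])
spread_sample = percent_score([5, 9, 13, 17, 20, 24, 27, 31, 34, 37])
even_half = percent_score(i for i in ALL_ITEMS if i % 2 == 0)
odd_half = percent_score(i for i in ALL_ITEMS if i % 2 == 1)

print(true_mastery, narrow_sample, spread_sample, even_half, odd_half)
```

Under this assumed layout the ten-item samples miss the true mastery by a wider margin than either twenty-item half does on the low side, and the full forty items recover it exactly, which is the point of the illustration: larger and better-distributed samples bring the score closer to the pupil's actual command of the field.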

As the reliability of a test is increased, either through more extensive
or more representative sampling, the operation of chance variations such as
temporary disturbances, breaking a pencil, and the like, is minimized.
Similarly, increasing the sampling of the test by increasing the number of
different times the pupil is required to respond to it tends to limit the
effect of physical disturbances, fatigue, emotional stress, etc. Thus,
practically every attempt to increase the consistency with which educational
tests measure abilities and achievements results in the increase in the
length of the test and testing period. It is becoming increasingly clear that
complex fields cannot be measured reliably by means of brief tests.

[Fig. 3. The Principle of Sampling]


The reliability of a published test should be given in the manual
of directions accompanying the test. This information is given as a
coefficient of reliability. The coefficient of reliability is a statistical
expression of the consistency of performance of the test and how much
reliance may be placed on scores obtained from its use. Reliability
coefficients are indicated in decimal fractions of 1, usually ranging
in value from .40 for tests of low reliability to .95 for tests of
very high reliability. It is improbable that a test will be entirely
inconsistent or perfectly consistent. However, it is difficult to state
exactly how low or how high a coefficient of reliability of a test
should be for the test to be of value. Much depends on the type of
test and how it is to be used. The data in Table 15 will give some
appreciation of a suitable coefficient of reliability.


TABLE 15

Reliability Coefficients

Coefficient of Reliability     Rating
 .50- .60                      Very low
 .60- .70                      Low
 .70- .80                      Fair
 .80- .85                      Average
 .85- .90                      Good
 .90- .95                      High
 .96-1.00                      Excellent
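
The table can be read as a simple lookup. The sketch below is one possible reading, not a rule given in the text: it assumes that each printed lower bound is inclusive, so that a coefficient of exactly .80 is rated "Average", and that anything under .50 falls below the table. The names `RATING_BANDS` and `rate_reliability` are hypothetical.

```python
# Lower bound of each band in Table 15, highest first, with its rating.
# Treating the lower bounds as inclusive is an assumption of this sketch.
RATING_BANDS = [
    (0.96, "Excellent"),
    (0.90, "High"),
    (0.85, "Good"),
    (0.80, "Average"),
    (0.70, "Fair"),
    (0.60, "Low"),
    (0.50, "Very low"),
]


def rate_reliability(coefficient):
    """Return the Table 15 rating for a reliability coefficient between 0 and 1."""
    if not 0.0 <= coefficient <= 1.0:
        raise ValueError("a reliability coefficient is expected to lie between 0 and 1")
    for lower_bound, rating in RATING_BANDS:
        if coefficient >= lower_bound:
            return rating
    return "Below the range of the table"


print(rate_reliability(0.93))
print(rate_reliability(0.82))
```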


In general, it is doubtful if a test should be given very serious con¬ 
sideration if it has a reliability coefficient of less than .80 when 
stepped up by the application of the Spearman-Brown technique from 
correlations on chance-half samplings of the exercises. Some experi¬ 
ence with long and quite reliable tests in the field of silent reading 
indicates that reliability coefficients which range as high as .96 when 
computed on the odd-even basis drop as low as .90 when results from 
the two equivalent forms of the test are correlated. Reliabilities of 
.90 based on the actual relationships between two equivalent forms of 
the test may be considered very high. Additional discussion of the
meaning and significance of reliability, as well as a more complete
explanation of the methods of securing and computing reliability data,
is given in Chapter XIV.
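
The stepping-up procedure mentioned above can be sketched briefly. The Spearman-Brown formula estimates the reliability of a test lengthened by a factor k from the correlation r between its parts as k·r / (1 + (k − 1)·r); for two half-tests, k = 2. The item marks below are hypothetical, the odd-even split is only one way of forming chance halves, and the function names are illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    covariation = sum(a * b for a, b in zip(dx, dy))
    return covariation / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def spearman_brown(part_r, length_factor=2):
    """Estimated reliability of a test length_factor times as long as one part."""
    return length_factor * part_r / (1 + (length_factor - 1) * part_r)


def odd_even_reliability(item_marks):
    """item_marks holds one list of 0/1 item marks per pupil.

    Returns the half-test correlation and the stepped-up full-test estimate.
    """
    odd_totals = [sum(marks[0::2]) for marks in item_marks]   # items 1, 3, 5, ...
    even_totals = [sum(marks[1::2]) for marks in item_marks]  # items 2, 4, 6, ...
    half_r = pearson_r(odd_totals, even_totals)
    return half_r, spearman_brown(half_r)


# Hypothetical marks for five pupils on an eight-item test (1 = correct).
marks = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
]
half_r, full_r = odd_even_reliability(marks)
print(round(half_r, 2), round(full_r, 2))
```

The stepped-up figure is always at least as large as the half-test correlation when that correlation is positive, which is consistent with the observation that odd-even coefficients tend to run higher than those obtained by correlating two separately administered equivalent forms.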

21. Objectivity. 

Objectivity is an important quality in a test exercise since it con¬ 
tributes indirectly to validity and reliability. Objectivity is that 
quality in a test exercise which makes for the elimination of the per¬ 
sonal judgment of the person who scores it. Since this means greater 
accuracy in the grading of such items it naturally indicates greater 
reliability in the measurement of whatever qualities are being
measured. Objectivity is a function of the form in which the test items are
stated. In general, objective-test items are so formulated that only
one correct response satisfies the conditions of the exercise. Recall,
true-false, and multiple-answer exercises are all illustrations of objective
forms. These and other common forms of objective exercises are
described and illustrated on pages 107-123 of this book.

22. Ease of Administration. 

The speed, accuracy, and simplicity with which an educational 
test can be given in the classroom, though not a major criterion for 
tests, is nevertheless one that is worthy of some practical considera¬ 
tion. There is a very definite tendency in modern test development to 
recognize the teacher’s and the supervisor’s problems by making the 
tests easy to administer and simple to interpret. 

A significant phase of the administrability of a test is the ex¬ 
aminer’s manual which accompanies it. The manual should contain 
a clear statement of the qualities measured by the test. It should 
provide concise and simple directions for giving the tests so that they 
may be followed verbatim by the classroom teacher. It should pro¬ 
vide the critical user of the test with an adequate explanation of the 
methods by which the validity and the reliability data were obtained, 
as well as a concise statement of the meaning of these data in relation 
to the tests themselves. A convenient form of answer key or simple 
stencil for scoring should be supplemented by brief explanatory 
directions concerning the methods of scoring the tests. Simple illus¬ 
trations of the methods of interpreting the results should be given 
in the manual; and if the field is one in which a follow-up program 
of remedial or corrective instruction is possible, brief suggestions for 
such work should be made. 

The better standardized tests provide the user with carefully formu¬ 
lated statements on the following types of items, all of which tend 
to protect the pupils and the teacher against the faulty administration 
of the tests and wrong and uncritical interpretations of the results: 

1. Number of parts in the test.
2. Directions for each part or division of the test.
3. Sample or fore exercises to acquaint the pupils with the method
   of work.
4. Directions for stopping at the end of a test part or for turning
   the page when necessary.
5. Definite statements of time limits where required.
6. Directions for scoring the tests.
7. Explanations of the norms or standards of accomplishment.
8. Statement of the total possible scores on each test part and the
   method of securing them.
9. Explanation and illustration of method of interpretation of
   results in terms of apparent instructional needs.
10. Suggestions for definite remedial attack on the weaknesses
    revealed by the tests.


23. Norms and Standards. 

Norms and standards, although frequently used as synonymous 
terms, are not exactly identical in meaning. Norms represent actual 
levels of accomplishment for specified groups of individuals. Stand¬ 
ards are usually considered as representing goals to be attained. One 
is what children or pupils are actually able to do; the other is what 
the teacher should strive to have them do. Practically all present-day 
tests are supplied with norms rather than standards. 

Norms are usually based upon the average or median accomplish¬ 
ment of large numbers of pupils grouped by ages or by grades. Grade 
norms result from the classification of the pupils by grades. Age 
norms result from the classification by ages. The norms, therefore, 
furnish the teacher with a definite basis for anticipating what given 
groups of pupils may be expected to achieve under ordinary school¬ 
room conditions. They thus afford the basis for the practical inter¬ 
pretation and evaluation of the testing program and of the classroom 
instruction under analysis. 
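
As a minimal sketch of how grade norms of this kind might be computed, the example below groups scores by grade and takes the median of each group. The records are hypothetical and far too few for real norms, which the text says should rest on large numbers of pupils; the function name `grade_norms` is an assumption of this illustration.

```python
from collections import defaultdict
from statistics import median


def grade_norms(records):
    """Return {grade: median score} from an iterable of (grade, score) pairs."""
    scores_by_grade = defaultdict(list)
    for grade, score in records:
        scores_by_grade[grade].append(score)
    return {grade: median(scores) for grade, scores in sorted(scores_by_grade.items())}


# Hypothetical (grade, score) records; real norms would need many more cases.
records = [
    (7, 31), (7, 28), (7, 35),
    (8, 40), (8, 37), (8, 44), (8, 39),
    (9, 48), (9, 45), (9, 51),
]
norms = grade_norms(records)

# A teacher might then compare a class median with the norm for its grade.
class_scores = [36, 41, 38, 42, 35]
difference_from_norm = median(class_scores) - norms[8]
print(norms, difference_from_norm)
```

Replacing the grade in each record with the pupil's age would give age norms by the same procedure.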

In general, norms, in order to have sufficient reliability for prac¬ 
tical classroom use, must be based upon rather large and carefully 
sampled populations. There is a growing tendency, however, to base 
standard test norms upon smaller groups of specially selected cases. 
In the past it has been assumed that the inclusion of a large popu¬ 
lation of unselected eighth-grade pupils would afford the best basis 
for an eighth-grade norm in a specific test field. Evidence brought 
forward by Crawford ^ indicates that this is not necessarily true. If 
there were a perfect balance in the proportion of pupils in a given 
grade who are retarded and accelerated as to mental ability and school 
progress, such an unselected group might provide suitable and
representative norms. The actual evidence shows, however, that in the
typical school grade the retardation (retarded progress) actually
exceeds the acceleration by a ratio of four to one and sometimes six
to one. Accordingly, the norms based on unselected cases are not
typical of the actual achievement of the normal individual in the
group. Serious consideration is being given by modern test workers
to the more exacting control of these variables. A number of the
newer tests are reporting norms based upon smaller groups of
individuals selected for their normality for the group of which they are
a part. In a number of cases age norms are based upon the results
of grouping the individuals by means of mental-test scores. This
procedure alone takes care of the serious difficulties arising when the
grouping is by chronological age. To the extent that mental tests are
standardized accurately, a twelve-year-old (mental age) individual
is a twelve-year-old wherever he may be found. Every teacher knows,
however, that a chronological twelve-year-old in the fifth grade is
quite unlike one in the seventh grade.

^ Crawford, J. R., Age and Progress Factors in Test Norms, University of Iowa
Studies in Education, Vol. 9, No. 4, June, 1934. University of Iowa, Iowa City.

24. Mechanical Features. 

The mechanical features of a test frequently operate definitely to 
affect its ease of administration in the classroom. They are largely
the result of the editing and printing of the test. The paper should 
be of good quality, preferably white bond. The illustrations should 
be clear-cut and easily identified with the content they are supposed 
to illustrate. The page size, the length of line, and the size of type 
used are also mechanical features which may influence the usefulness 
of a test. 

25. Economy. 

Other things being equal, economy as a criterion for standard tests 
should undoubtedly be listed last. In the final analysis, any test 
which takes up class time may be counted expensive. Moreover, 
the cheap test is not always the most economical; in fact, quite the 
reverse is apt to be true. Tests costing at the rate of 50 cents per
hundred and yielding results of limited validity and low reliability 
might readily be much more expensive than tests costing six times 
as much but having validity indices of .80 and reliability coefficients 
of .93. It is not at all unlikely that in the near future educational 
tests will be evaluated in terms of the number of units of valid and 
reliable information yielded per unit of cost. 

A modern tendency in test development involves the introduction 
of certain economy features in the booklet, either through the use of 
automatic scoring devices or through design which permits the repeated 
use of the test booklets. Such mechanical features in a test are quite 
acceptable so long as they do not force the test into too badly crowded 
a page, or do not interfere with the validity and reliability of the 
measurement which would otherwise be attained.

26. Number and Equivalence of Forms.

The more useful educational tests are those which exist in multiple
forms. The forms of a test are secured by preparing two or more
arrangements of similar but not identical test exercises and assigning 
them to different test booklets. These multiple forms should normally 
be approximately equal in difficulty, not only in terms of the total 
scores earned by groups of equally able individuals, but also in terms 
of the ratings of items in the different levels of the test. 

II. EVALUATION OF TESTS 

27. Test-Rating Scales. 

In the foregoing discussions no attempt has been made to evaluate 
the items mentioned in the criteria but merely to explain and discuss 
the types of information which are of value in selecting a published 
test or in judging the value of an objective shop test. However, a 
number of rating scales have been developed in which the different
factors are weighted to give a test a rating on the basis of 100
points. These scales are useful to all teachers in selecting tests but
are especially valuable to the inexperienced teacher until he becomes
accustomed to judging the different items. The Otis test-rating scale
is reproduced here. 

Otis Test-Rating Scale ^

Manual (5)
Validity (15)
Reliability (10)
Reputation (5)
Ease of administration (total 15)
    (a) Preparation (4)
    (b) Time limits (4)
    (c) Explanation needed (3)
    (d) Alternative forms (4)
Ease of scoring (total 15)
    (a) Objectivity (10)
    (b) Time required (3)
    (c) Simplicity (2)
Ease of interpretation (total 15)
    (a) Norms (5)
    (b) Directions for interpreting (4)
    (c) Class record (1)
    (d) Application of results (5)
Convenient packages (5)
Typography and make-up (5)
Test service (10)

Total 100

^ Otis, A. S., "Scale for Rating Tests," Test Service Bulletin No. 13. Yonkers:
World Book Company. 6 pp.
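
For a teacher who wishes to tally ratings on this scale, the weights can be held in a small table and checked for consistency. The sketch below is an illustration only: the data structure and the names `OTIS_WEIGHTS`, `MAX_POINTS`, and `rate_test` are assumptions of this example, while the point values are those of the scale, with the sub-items under each "ease" heading summing to 15 and the whole to 100.

```python
# Maximum points for each item of the Otis scale; nested entries are sub-items.
OTIS_WEIGHTS = {
    "Manual": 5,
    "Validity": 15,
    "Reliability": 10,
    "Reputation": 5,
    "Ease of administration": {
        "Preparation": 4,
        "Time limits": 4,
        "Explanation needed": 3,
        "Alternative forms": 4,
    },
    "Ease of scoring": {
        "Objectivity": 10,
        "Time required": 3,
        "Simplicity": 2,
    },
    "Ease of interpretation": {
        "Norms": 5,
        "Directions for interpreting": 4,
        "Class record": 1,
        "Application of results": 5,
    },
    "Convenient packages": 5,
    "Typography and make-up": 5,
    "Test service": 10,
}


def flatten(weights):
    """Turn the nested scale into a flat {item name: maximum points} mapping."""
    flat = {}
    for name, value in weights.items():
        if isinstance(value, dict):
            for sub_name, points in value.items():
                flat[f"{name}: {sub_name}"] = points
        else:
            flat[name] = value
    return flat


MAX_POINTS = flatten(OTIS_WEIGHTS)


def rate_test(awarded):
    """Sum the points awarded to a test; items left out count as zero."""
    total = 0
    for item, points in awarded.items():
        if item not in MAX_POINTS:
            raise KeyError(f"unknown item: {item}")
        if not 0 <= points <= MAX_POINTS[item]:
            raise ValueError(f"{item} allows 0 to {MAX_POINTS[item]} points")
        total += points
    return total


# Hypothetical partial rating of one test.
print(rate_test({"Validity": 12, "Reliability": 8, "Ease of scoring: Objectivity": 9}))
```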
III. SUMMARY

The teacher should select his tests with care in order to get full 
value for the money and time expended for testing purposes. The 
chief factors to be considered in selecting a test are validity, reli¬ 
ability, objectivity, norms, multiplicity and equivalence of forms, ease 
of administration, and cost. Validity is the degree to which a measur¬ 
ing instrument measures the thing it purports to measure. Reliability 
is the consistency of performance of the test itself. Objectivity is de¬ 
termined by the form of exercise, which in turn controls the number 
of acceptable answers for each question. Norms are the median or 
average performance of pupils of different ages or grades as deter¬ 
mined by the testing of large numbers of individuals. Standards are 
desirable ultimate goals of attainment. Tests should be economical 
of time and money. The Otis test-rating scale is suggested as a use¬ 
ful guide for the shop teacher in the selection of commercial tests. 


SUMMARY EXERCISES FOR DISCUSSION 

1. Formulate a concise definition of each of the major criteria for the selection
   of an educational test.
2. Illustrate at least three types of procedure which may be used in the
   validation of industrial education test items.
3. Show by means of a concrete illustration how sampling affects the reliability
   of a test.
4. What are some of the most effective devices for making test exercises
   objective?
5. In your judgment, is the apparent difference between norms and standards
   a matter of any practical significance?
6. Secure at least one complete sample set of standardized tests suitable for
   use in industrial education classes. Examine the tests, the manuals, keys,
   and other supplementary material and rate the test in detail, using the
   Otis Score Card for Rating Tests. Consider each item in the light of the
   explanations given with the score card, and assign a value to each point in
   proportion to its apparent merit.

CHAPTER V


MEASURABLE FACTORS IN INDUSTRIAL EDUCATION 

28. Factors Related to Mastery in Industrial Education. 

Industrial education teachers, in common with teachers in most 
other subjects, need a great deal of precise information about the 
pupils in their classes if they are to work effectively. Most of the 
data needed can be obtained through measures which are much supe¬ 
rior to the teacher’s unaided judgment. Some of these measuring in¬ 
struments are still to be developed, and it is certain that many now 
available need refinement. It is the purpose of this chapter to point
out some of the measurable factors in industrial education. These 
factors present a real challenge to teachers of industrial education. 
Educational guidance and shop instruction both doubtless would be 
markedly improved through the measurement of these factors and the 
wise use of the results in the classroom. These measurable factors 
are enumerated in Table 16, and each will be discussed in this chapter. 

TABLE 16

Measurable Factors in Industrial Education

Measurable Factors                      Comments

 1. Information                         Factual information about tools, materials,
                                        and vocations. (Oak is a cabinet wood; the
                                        micrometer is an instrument used to measure
                                        in thousandths of an inch; outside paint
                                        contains oil.)
 2. Quality                             Evaluation of the product of manipulative
                                        work in the light of tool, instrument, or
                                        machine operations (drawing, hammering,
                                        house wiring, bookcase, cement lawn
                                        pedestal).
 3. Technique                           Evaluation of skill in manipulating tools,
                                        instruments, or machines in executing tool
                                        operations (method of using a plane, a
                                        compass, a lathe).
 4. Speed (Rate of response)            The time required to accomplish a piece of
                                        work employing commercial standards (time
                                        required to make a drawing, a table, wire
                                        a house).
 5. Reading technical symbols           Ability to read working drawings, wiring
                                        diagrams, architectural drawings, etc.
 6. Reading                             Ability to read and comprehend instructions
                                        or related information from the printed
                                        page.
 7. Spelling                            Evaluation of ability to spell common words
                                        and necessary technical terms.
 8. Mathematics                         Evaluation of mathematics required in the
                                        various shop courses (woodwork, drawing,
                                        sheet metal, machine shop, home mechanics).
 9. Appreciation of industrial          Evaluation of ability to rank industrial
    products                            products according to merit (furniture,
                                        electrical devices, finishes, automobiles,
                                        houses, radio).
10. Planning                            Evaluation of ability to develop a suitable
                                        plan for doing a job (building a lawn
                                        bench, a dog house, a fence, a radio, etc.).
11. Language                            Ability to use correct English in written
                                        and oral form.
12. Inventiveness                       Ability to see new relations and develop
                                        devices and machines for the improvement
                                        of society.
13. Personality traits                  Rating of traits generally recognized as
                                        essential to success (industry, cooperation,
                                        consideration for others, self-reliance,
                                        aggressiveness).
14. Mechanical aptitude                 Natural aptitude for manipulating mechanical
                                        devices and an understanding of their
                                        operation.
15. Intelligence                        Ability of an individual to learn as
                                        measured in terms of the extent a pupil has
                                        acquired a number of specific and largely
                                        unrelated abilities.

These fifteen factors are obviously closely related to mastery in
industrial education. To the extent that they are basic they should
represent the framework of measurement in this field. It is probable
that they can be measured most effectively and the results used to
best advantage when each is tested separately. It should not be
unduly difficult to construct devices for the measurement of information,
tool techniques, quality, and speed in more or less isolated forms as a
basis for a real analysis of achievement in this subject. It is believed,
therefore, that these categories are of sufficient importance to warrant 
the elaboration and discussion of each in order of appearance in 
Table 16. 

Information. Industrial education courses contain a great deal of
knowledge about tools, machines, materials, and industries. This is
especially true in the junior high school where a great deal of emphasis
is placed on outlook and cultural training and less on the strictly
vocational aspects of the subject. The vocational courses also contain
much instructional material which is designed to develop knowledge
about tools and materials rather than skill in the actual modification
of materials through the use of tools and machines. This information
or knowledge is similar in all its psychological aspects to knowledge
in other subjects of the school curriculum and can be measured
effectively with the common type of objective examination. In fact, the
majority of the tests of achievement which to date have appeared in
industrial education are measures of information. The same
techniques which have been used for constructing tests in other subjects
can be applied with only slight modification. However, there is still 
a great deal of work to be done before an adequate supply of measures 
of information will be available for the content of the varied subjects 
of industrial education. Some indication of the extent of this problem 
will be gained from an examination of the accompanying outline of 
informational items on woodworking adapted from the American Vo¬ 
cational Association committee’s report ^ on standards of accomplish¬ 
ment in the industrial arts. 

EXAMPLES OF THE INFORMATIONAL CONTENT OF 
INDUSTRIAL EDUCATION 

WOODWORKING 

A. Lumber. 

1. Know the principal characteristics, the working qualities, the principal 
uses, and the sources of supply of the following woods: pine, cypress, 
oak, walnut, birch, maple, mahogany, red cedar, hickory, gum, chestnut, 
poplar. 

2. Know the methods of cutting and milling lumber. 

3. Know how lumber is dried. 

¹ Report of American Vocational Association Committee on Standards of 
Attainment in Industrial Arts, Bulletin of American Vocational Association, 
Industrial Arts Section, pp. 26-31, December, 1931. 

4. Know the effect of moisture on wood. 

5. Know the standard dimensions of lumber and how classified. 

6. Know the nominal and the actual dimensions of lumber. 

7. Know how veneer and plywood are made, and their uses. 

B. Finishes. 

1. The object of finishes. 

2. The kinds of finishes in common use; such as stain, oil, wax, shellac, 
varnish, lacquer, enamel, paint. 

3. The durability of different finishes. 

4. The conditions or places in which various kinds of finishes may be used 
to advantage. 

5. Materials from which finishes are made. 

C. Glue. 

1. The kinds of glue. 

2. The preparation of glue. 

3. The conditions and requirements in use. 

D. Nails, brads, and fasteners. 

1. The kinds of nails. 

2. The uses of the different kinds. 

3. The size of nails. 

4. How nails are sold. 

5. How nails are manufactured. 

6. Sizes of brads and how sold. 

7. Size, kinds, and uses of corrugated fasteners. 

8. Sizes and uses of clamp-nails. 

E. Screws. 

1. The kinds of screws. 

2. The uses of the different kinds. 

3. How the sizes and kinds of screws are indicated. 

4. How they are sold. 

F. Sandpaper and steel-wool. 

1. The kinds of sandpaper. 

2. Grades of sandpaper. 

3. Principal uses. 

4. Grades and uses of steel-wool. 

G. Design of furniture. 

1. Is it adapted to the use for which it is indicated? 

2. Is it structurally good? 

3. Is it well made? 

4. Are the structural members in good proportion? 

5. Does it have an appearance of stability? 

6. Is the structure as a whole well-proportioned? 

7. Are the outlines pleasing? 

8. Is it well finished with an appropriate finish? 

H. Manufacture of wood products. 

1. The location of important manufacturing concerns. 

2. The division of labor in industry. 

3. The use of automatic machinery. 

I. Joints. 

1. Types of joints, where used, and why. 

J. Hardware. 

1. Types of hinges and their uses. 

2. Types of latches and where used. 

3. Types of locks and where used. 

4. Types of nails and where used. 

5. Special types of fittings. 

K. Abrasives. 

1. Kinds of grinding and sharpening stones, their grades and uses. 

The American Vocational Association committee on standards of 
accomplishment in industrial arts has aptly referred to knowledge of 
the informational type as “the things you should know.” The examples 
adapted from the committee’s report are not given as necessarily 
complete either by the committee or the present authors, but 
they are suggestive of the type of information that can be tested by 
the customary measurement techniques. The construction of devices 
for the measurement of information is discussed and illustrated in 
Chapter XI. 

Quality. The quality of a project depends on how well the various 
tool operations have been executed, assuming, of course, a constant 
quality of material. It is the composite result of the type of material 
used and the skill with which it was worked. Quality involves such 
factors as squareness, roundness, finish, fasteners, exactness of 
dimensions, accurate placement of parts, etc. To secure a reliable measure 
of quality, it is usually necessary to employ a performance test. The 
pupil or pupils being tested make a standard project which gives 
samples of their work with different tools, instruments, or machines. 
The results obtained are then rated on suitable scales of quality and 
thus the teacher is able to obtain a reasonably objective estimate of 
quality of work. A further discussion of the measurement of quality 
is given in Chapters XI and XIII. 

Quality may be scored by physical measurement, by the use of 
scales, and by general observations of the student’s procedure and 
product. The following are examples of measurable qualities from 
several industrial education subjects. The examples reported in Table 
17 are taken from courses of study, committee reports, and the results 
of job analysis. They are not given as complete but should prove 
suggestive for test workers and teachers who may desire to construct 
performance tests. 

TABLE 17 

Examples of Operations That Determine Quality in Industrial Education 
Subjects 

Subject               Operations 

1. Woodwork           Planing, joining, nailing, screwing, sawing, 
                      measuring, shaving, scraping, boring, chamfering, 
                      chiseling, gouging, filing, gluing. 

2. Auto mechanics     Tighten bolts, install cotter pins, solder, measure 
                      in thousandths of an inch, fit bushing, fit bearings, 
                      fit rings, fit pistons, time motor, clean, grease. 

3. Drawing            Closing corners, lettering, numbering, placement of 
                      drawing on page, measuring, neatness, dimensioning, 
                      circles, ellipse, irregular figures. 

4. Electricity        Skinning wire, soldering, splicing, insulating, 
                      cutting wire, bending and straightening wire, 
                      installing conduit, installing binding posts, switch 
                      installation; drilling metal, concrete, and brick; 
                      boring wood. 

5. Home mechanics     Soldering, cutting wire, splicing, insulating, 
                      skinning wire, attaching wire to binding posts; 
                      drilling metal, brick, and concrete; planing, boring 
                      wood; chiseling, nailing, screwing, sawing, cutting 
                      pipe, tightening pipe joints, tightening belts, 
                      splicing belts, filing; applying varnish, enamel, and 
                      paint. 

Technique. In general, the individual who manipulates the tools, 
machines, or instruments in the shop or at his desk in the most direct 
and efficient manner has the best technique. A pupil with good technique 
can improve his skill to the maximum of his ability by practice. 
Before technique can be measured it is necessary to know just what 
constitutes the best technique for using different tools and machines 
for different purposes. To measure technique, therefore, it is necessary 
to emphasize the best shop procedures for the tools used in the 
course and then check the technique of the pupil being tested through 
observation. Tests of this type are especially helpful in diagnosing 
pupil difficulties in the manipulative phases of the course of instruction. 
Tests of technique should prove of special value in certain types 
of trade- and continuation-school classes. 

Speed. Speed is measured by the time it takes to do a job with 
the quality held at a standard. Unless the quality is held constant, 
the time required to do a piece of work is not a suitable basis for 
determining speed. Speed has considerable importance in trade courses, 
but in cultural courses of the junior-high-school type it is of much 
less significance. If a student has good technique he can develop his 
maximum speed through practice. At the present time the authors 
do not know of any suitable tests of speed which are available for 
general use in industrial education. 
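
The rule that times are comparable only when quality is held at a standard can be illustrated with a short sketch. It is not a published test of speed; the pupil labels, the ten-point quality ratings, the times, and the standard of 7 are all hypothetical figures chosen for the example.

```python
# Hypothetical records: (pupil, quality rating on a 10-point scale, minutes taken).
records = [
    ("A", 8, 42.0),
    ("B", 5, 30.0),  # fastest, but the work falls below the quality standard
    ("C", 7, 38.5),
    ("D", 9, 51.0),
]

QUALITY_STANDARD = 7  # assumed minimum acceptable rating


def speed_ranking(records, standard):
    """Rank by time only those pupils whose work meets the quality standard."""
    qualified = [r for r in records if r[1] >= standard]
    return [name for name, _, _ in sorted(qualified, key=lambda r: r[2])]


ranking = speed_ranking(records, QUALITY_STANDARD)
print(ranking)
```

Pupil B finishes first but is left out of the ranking, since a short time on work of substandard quality tells nothing about speed in the sense used here.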

Reading Technical Symbols. It is necessary to read drawings and 
interpret various types of symbols in industrial education. In life and 
in school it is often of more practical use to be able to read symbols 
than to make them. In a trade course where the objective is strictly 
vocational, both the making and reading of symbols may be of 
importance. In order to get an adequate measure of a pupil’s ability 
to read drawings or symbols, it is necessary to construct objective 
tests which give the pupil an opportunity to read enough drawings 
to determine his ability in this respect. No adequate tests of this 
type have appeared thus far for general use, but sufficient work has 
been done to demonstrate their feasibility. 

Reading. Teachers of industrial education should quickly discover 
the reading abilities and limitations of their students. Those who are 
defective in reading should be given remedial help designed to overcome 
the difficulty, rather than be penalized by a poor 
mark because they are unable to understand the printed instructions. 
Since written instruction sheets are coming into common use in 
industrial education subjects, reading is especially important. A number 
of excellent tests of reading ability are available. 

Spelling. More written work is found in certain phases of shop 
work than formerly, and the shop teacher should assume his share of 
the responsibility for the elimination of spelling difficulties. Certain 
technical terms should be mastered by the pupil so that he has no difficulty 
in pronouncing or spelling them correctly. Such training should 
be included as part of the regular course. Considerable research has 
been done in spelling, and many good spelling tests are available for 
determining spelling difficulties. Suitable spelling tests can easily be 
constructed for the purpose of measuring the pupils' mastery of the 
technical words used in the respective courses. Industrial education 
teachers will do well to make the pupil’s spelling ability entirely distinct 
from achievement in the course. 

Mathematics. Many shop courses involve related mathematics. 
The student usually has some mathematical background when he 
comes into the shop, but often many details have been forgotten. 
Accordingly, there are ordinarily wide differences in the abilities of 
the pupils in the class. Some may be able to do the related mathematics 
whereas others may need remedial instruction. A few tests 
in shop mathematics have appeared, but they have been more general 
than specific. There is still a need for tests which treat the mathematics 
needed for special industrial education subjects. Some industrial 
subjects in which related diagnostic tests in mathematics may 
be used effectively are printing, electricity, woodwork, auto mechanics, 
sheet metal, and machine shop. 

Appreciation of Industrial Products. One of the major objectives 
of junior-high-school industrial arts is the development of the 
consumer's appreciation of industrial products. The majority of people 
are more frequently buyers of industrial commodities than they are 
the actual producers. This means that an appreciation of the products 
of industry and the trades is important and should be measured. 
Little usable material is now available for the measurement of this 
factor. It, therefore, presents an unusual challenge to test workers in 
industrial education. Psychologically, the selection of an article 
involves the making of a judgment based on the composite of several 
variable factors. For example, if it is desired to select a kitchen chair 
the following factors might come into consideration: material, cost, 
utility, design, weight, strength, and finish. It is obvious that a 
consumer must have a knowledge of the qualities of the commodity, and 
some experience in evaluating them, before an adequate selection can 
be made. At the present time a rating scale seems the most satisfactory 
means of developing and measuring consumers’ appreciation. 
Specific suggestions on the construction of such rating scales are given 
in Chapter XII. 
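
As a rough illustration of how a judgment based on the composite of several factors might be reduced to a single rating, the sketch below averages ratings over the seven kitchen-chair factors named above. The five-point scale, the equal weighting, and the two sets of ratings are assumptions made for the example; they are not taken from the rating scales of Chapter XII.

```python
FACTORS = ["material", "cost", "utility", "design", "weight", "strength", "finish"]


def composite_rating(ratings, weights=None):
    """Return the weighted mean of ratings (1 = poor, 5 = excellent) over FACTORS."""
    if weights is None:
        weights = {f: 1.0 for f in FACTORS}  # assumed: every factor counts equally
    total_weight = sum(weights[f] for f in FACTORS)
    return sum(ratings[f] * weights[f] for f in FACTORS) / total_weight


# Hypothetical ratings of two kitchen chairs by one pupil.
chair_a = {"material": 4, "cost": 3, "utility": 5, "design": 4,
           "weight": 3, "strength": 5, "finish": 4}
chair_b = {"material": 3, "cost": 5, "utility": 4, "design": 3,
           "weight": 4, "strength": 3, "finish": 3}

score_a = composite_rating(chair_a)
score_b = composite_rating(chair_b)
print(round(score_a, 2), round(score_b, 2))
```

A teacher who considers some factors more important than others could pass a different set of weights; the choice of weights is itself a judgment and should be stated along with the ratings.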


Planning. Planning involves the ability to map out a direct and 
effective method for doing a job. It is generally conceded that, before 
a workman can plan the best method for doing a job, he must have 
an understanding of the factors involved. Ability to plan is now generally 
given as one of the desirable objectives to be achieved in 
industrial education courses. Individual differences are as great in 
ability to plan a project as in ability in other directions. Some of 
this difference in ability to plan is due to lack of training and some 
to native capacity. The habit of attacking problems in an orderly 
manner is a valuable one in any type of occupation, and industrial 
education work offers many opportunities for its exercise. Here again 
no suitable tests of planning have appeared, although they would be 
of value. A measure of a student’s ability to plan, as it relates to a 
single subject, may be secured by giving a number of situations which 
require the formulation of a plan. The plans proposed by the students 
may then be compared with an ideal solution and with proposals of 
other pupils having similar backgrounds. 
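
One way such a comparison with an ideal solution might be made objective is sketched below. The text prescribes no scoring method; the use of Python's `difflib.SequenceMatcher` as a measure of agreement between two ordered lists of steps is our own substitution for illustration, and the steps themselves are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical ideal order of operations for a small woodworking job.
IDEAL_PLAN = ["measure", "mark", "saw", "plane", "sand", "finish"]


def plan_agreement(pupil_plan, ideal_plan=IDEAL_PLAN):
    """Return a score from 0.0 to 1.0 for how closely the pupil's ordered
    steps agree with the ideal plan (1.0 means identical plans)."""
    return SequenceMatcher(None, ideal_plan, pupil_plan).ratio()


# A pupil who saws before marking and omits sanding.
pupil_plan = ["measure", "saw", "mark", "plane", "finish"]

score = plan_agreement(pupil_plan)
print(round(score, 2))
```

The same function could be applied to the plans of other pupils with similar backgrounds, so that each plan is judged against both the ideal and the group.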

Language. The industrial education teacher needs to give attention 
to the language difficulties of his students in order to be of greater 
service to them in developing desirable oral and written language 
habits. The language of industrial education courses involves essentially 
the same principles as those governing correct usage in any 
subject. This enables the industrial education teacher to make use 
of available diagnostic tests for determining language difficulties. 
Here, as in spelling, teachers of industrial education should distinguish 
between achievement in industrial education courses and development 
in the use of language. A pupil's achievement in an industrial education 
subject should not be penalized because of language errors. 
Achievement in a course in woodworking is one thing, and the mistakes 
a student makes in language are quite another. The wise and 
sympathetic teacher will give the pupil special help designed to overcome 
his language difficulties, rather than penalize him by lowering 
his mark in industrial education achievement. 

Inventiveness. Psychologically, inventiveness is similar to planning, 
in that it involves the formulation of a plan or series of plans. 
It obviously requires a higher type of mental ability, even to the extent 
of demanding abstract thinking with a dash of constructive imagination 
thrown in. Every shop teacher has met the boy who believes 
he has the solution of perpetual motion, or who knows he can improve 
available shop equipment by making certain changes in the machines. 
The authors have had several experiences in which boys sanding wood 
by hand near the wood lathes have conceived the idea of making a 
cylindrical sander by using the lathe. Probably not one of these boys 
had ever seen a power sander, and would have been greatly surprised 
as well as disappointed to learn that their invention had been used 
for several generations. So far, no adequate test of inventiveness has 
been developed. 

Personality Traits. In recent years much consideration has been 
given to the significance of personality or character traits. The 
American Vocational Association committee on standards of 
accomplishment in industrial arts ² refers to this phase of the work as “what 
you should be,” and suggests the following traits as being worthy of 
development because of their recognition as essential to success in life: 
industry, cooperation, consideration of others, self-reliance, and readiness 
to assume responsibility. Character development is certainly an 
important phase of education for a democracy. Industrial education 
teachers have many opportunities to develop these traits in their 
students. Ordinarily, personality or character traits are measured by 
means of a rating scale. Several such scales have been developed for 
general use, but no published tests have so far appeared for rating 
pupils in the shop. 

² Standards of Attainment in Industrial Arts Teaching, Bulletin of the 
American Vocational Association, New York, December 12, 1931, pp. 21. 

Mechanical Aptitude. Mechanical aptitude may be thought of as 
the capacity of a pupil to deal successfully with mechanical devices. 
Mechanical aptitude is now generally recognized as a measurable quality. 
It varies considerably among individuals and, in general, has a 
low correlation with intelligence scores. A knowledge of a pupil’s 
mechanical ability is of value in assigning projects and in giving 
guidance suggestions for industrial vocations. Considerable research 
has been done on mechanical aptitude, and several good tests are 
available. 

Intelligence. General intelligence is commonly considered the ability 
of an individual to learn. A knowledge of a pupil's general intelligence 
is of considerable importance in teaching and in educational 
guidance. To date, more than two hundred tests of intelligence have 
been published. The majority of these have received little consideration 
because of the lack of adequate validation, or the unreliable 
measures resulting from their use. 

SUMMARY 

Industrial education teachers need precise knowledge about their 
pupils as an aid in teaching, rating, and guidance. Fifteen important 
and measurable factors in industrial education are: information, quality, 
technique, speed, reading technical symbols, reading, spelling, 
mathematics, appreciation of industrial products, planning, language, 
inventiveness, personality traits, mechanical aptitude, and intelligence. 
Many of these measurable factors are a distinct challenge to 
test workers and teachers in industrial education. In general, the 
measurable factors of industrial education can be tested most effectively 
when they are measured individually or in a separate division 
of a test. 

SUMMARY EXERCISES FOR DISCUSSION 

1. Define and illustrate each of the measurable factors listed in Table 16. 

2. Outline a plan for measuring objectively a pupil’s ability to plan an attack 
on an industrial education problem. 

3. State three or more objective questions or exercises designed to measure 
inventiveness. 

4. Show how reading ability is an important factor in the measurement of 
achievement in industrial education. 

5. Make a list of the technical words that pupils in industrial education courses 
should be able to spell. 


CHAPTER VI 


ADMINISTERING INDUSTRIAL EDUCATION TESTS 

29. Responsibility for Giving and Scoring the Tests. 

The matter of determining the responsibility for giving and scoring 
educational tests rests chiefly upon the function the tests are expected 
to perform. If the tests are of the narrow-function type, closely paralleling 
the course of study taught by the teacher, they should undoubtedly 
be given by the teacher himself. If they are designed for survey 
purposes, or if the results are to be used for experimental, supervisory, 
or research purposes, they should probably be given by some one 
representing the administrative office of the school. Since these latter 
uses of the modern educational test are by far in the minority in most 
school systems, it is obvious that most of the classroom testing will 
be done by the classroom teacher. 

An excellent generalization for determining the responsibility for 
the administration of educational tests may be stated as follows: 
Whenever the test results are of a type to provide the teacher with a 
reliable and valid basis for the discovery of individual pupil difficulties 
in learning or achievement, they should so far as possible be 
administered and interpreted by the teacher himself; otherwise, they 
should be administered by some other school official or a disinterested 
party. The single important exception to this general policy is the 
individual mental test, which, of course, should be given by a trained 
and experienced examiner other than the classroom teacher. 

Properly selected tests for classroom use will contain so much valuable 
information that the teacher, in most cases, will be robbed of a 
rich opportunity to learn about his pupils and his own instructional 
efficiency if he does not insist upon his right to score the papers himself. 
This is particularly true of industrial arts subjects, in which the 
instructor will be mainly interested in the pupil’s mastery of content. 
Teachers should regard the scoring of educational tests, whether they 
be those selected and used by themselves with their own classes or 
superimposed tests, as a personal responsibility and as an opportunity 
for securing significant information which should distinctly improve 
their teaching practice. 

30. When to Give Tests. 

The matter of when to use an educational test in the classroom, 
like the location of the responsibility for giving it, is determined almost 
entirely by the function the test is to perform. In the period of test 
development when the tests were not so numerous and lacked sufficient 
reliability for individual pupil analysis, the common practice was 
to administer a test at the end of the school term. This was adequate 
to give a general picture of the end-product of instruction, but it 
failed to accomplish two very important things from an educational 
point of view. In the first place, assuming that one of the very important 
functions of school training is to bring about changes in the 
quality of pupil response or the level of mastery, such a procedure 
gives no basis for such an evaluation. Progress cannot be determined 
by end-of-the-year measurement alone. In the second place, any 
weaknesses revealed by this end-of-the-year measurement are brought 
out too late to permit anything to be done about them. Remedial 
and corrective instruction under these conditions of measurement is 
impossible. Accordingly, many teachers are now making use of similar 
forms of tests at the beginning of the year as a check on initial 
status, and again at the end of the year as a measure of the year's 
accomplishment. This type of measurement permits an evaluation of 
initial status and the relative efficiency of instruction during the year, 
as well as presenting a fairly accurate picture of pupil growth in mastery 
during the period. 
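
The arithmetic behind this beginning-and-end plan is simple and may be sketched as follows. The pupil names and scores are hypothetical, and the sketch assumes that the two forms of the test are similar enough for their scores to be subtracted directly.

```python
from statistics import mean

# Hypothetical scores on similar forms of the same test.
initial = {"Adams": 31, "Baker": 45, "Clark": 38}  # beginning of the year
final = {"Adams": 52, "Baker": 58, "Clark": 41}    # end of the year


def gains(initial, final):
    """Return each pupil's growth: end-of-year score minus initial score."""
    return {pupil: final[pupil] - initial[pupil] for pupil in initial}


pupil_gains = gains(initial, final)
class_gain = mean(pupil_gains.values())
print(pupil_gains, round(class_gain, 1))
```

The end-of-year scores alone would place Baker first and Clark last; the gains show in addition that Adams has grown most during the year and that Clark has grown very little.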

The development of extensive and detailed tests based upon a much 
more critical analysis of the different fields of instruction gave rise to 
a more refined idea of the use of tests. Clearly, the use of the test 
at the beginning of the term was justified mainly by the fact that 
it made it possible for educational progress or improvement to be evaluated. 
The end-of-the-year test was justified largely on the same 
basis, for it certainly did not provide any adequate basis for corrective 
instruction since the pupils involved were likely to be out of the hands 
of the teachers by the time the tests themselves were corrected and 
interpreted. The next logical step in the use of tests, therefore, was 
to construct many tests measuring a rather limited unit of instructional 
material. These tests made it possible for the teacher to make 
an immediate check-up on the efficiency of instruction as soon as the 
teaching of a particular unit of subject-matter was completed. The 
fact that these tests were each standardized as of the end of the particular 
instructional period involved made them especially useful to 
the teacher as the basis for organizing corrective and preventive work. 

The narrow-function unit-type tests have been slow to develop in 
the industrial education field although several useful contributions 
have been made. For the most part these tests are not standardized, 
nor has their reliability of measurement been critically checked. However, 
they present useful material and should point the way to additional 
contributions in this field. The validity of the tests has been 
determined with much more care than has the reliability. It is probably 
safe to assume, however, that the reliability, low as it may be in 
certain cases, is much better than the teacher’s subjective judgment. 

The Nash-Van Duzee Instructional Review Tests in Mechanical 
Drawing ¹ is a good example of a series of short tests based on a carefully 
validated group of instructional divisions. 

These tests include the following subject-matter units which were 
selected after an analysis of textbooks, courses of study, and drawing 
teachers’ judgments: drawing instruments and their use, terms and 
definitions, lettering, orthographic drawings, working drawing, sections, 
graphs, inking technique, construction problems, developments, 
materials, screw threads, conventions, fastenings, pictorial drawing, 
architectural drawing, gears, cams, detailed drawing, and assembly 
drawing. 

The Hunter Shop Tests ² also have value in measuring a number of 
instructional units in different phases of industrial education. However, 
their validity has not been critically determined by a check of 
representative courses of study, and they cannot be used for the detailed 
analysis that is possible with the Nash-Van Duzee Instructional Review 
Tests in Mechanical Drawing. 

It is thus clear that the program of test administration in this, as in 
other instructional fields, is largely conditioned by the use to be made 
of the results. In general, if the function is mainly that of a survey 
for the purpose of comparing school with school, or class with class, 
the use of the test at the end of the school year may be adequate. 
If evaluation of pupil development in mastery or achievement is the 
object, then cross-sections at the beginning and at the end of the 
instructional period should be taken. If an objective basis for immediate 
pupil adjustment through remedial and corrective instruction is 
the major purpose, then the narrow-function unit-type tests must be 
used systematically throughout the year. Naturally, this type of testing 
program is fairly expensive in terms of time, pupil-teacher effort, 
and financial outlay. However, there are serious doubts whether in 
the last analysis any other type of testing program can be justified. 
The teacher has a right to expect a tangible return in the form of 
supervisory suggestions and remedial helps for his class for any time 
spent in taking, giving, scoring, or interpreting educational tests. 

¹ Nash, Harry B., and Van Duzee, R. R., Instructional Review Tests in 
Mechanical Drawing, Bruce Publishing Company, Milwaukee, Wisconsin, 1930. 

² Hunter, Wm. H., Hunter Shop Tests, Manual Arts Press, Peoria, Illinois. 

31. Controlling the Variables in Testing. 

Most modern tests on the informational aspects of industrial education 
are constructed in such ways that almost any shop teacher who 
is reasonably skillful in maintaining the discipline of his class and who 
will follow the directions accompanying the tests can administer them 
without difficulty. It is always desirable, of course, that the teacher 
should become very familiar with the examiner’s manual before attempting 
to give any kind of test. If the examiner is inexperienced 
in the giving of tests he should try the test out on someone before 
giving it to his class. If this is impossible, the test itself and the directions 
for administering it should be read through several times before 
attempting to give it to a class. 

The following general suggestions may be useful to the individual 
not widely experienced in the administration of tests: 

1. Before beginning the tests have the desks cleared and see that 
each pupil is provided with one or more pencils. Have a number 
of extra pencils available for emergencies. 

2. The room should be quiet throughout the tests. Require strict 
attention to the directions, and see that the pupils follow your 
commands at once. If the group tested is large, additional 
proctors may be necessary. They should move quietly about 
the room and see that all pupils get started correctly and 
together. 

3. The examiner (and proctors) should pass down the aisles and 
place a test booklet on the desk of each pupil with the cover 
page (page 1) facing the pupil. If the tests are in mimeographed 
form, place the folders face down and instruct pupils 
to leave them in that position until they are told to fill in the 
blanks at the top of the first page. 

4. All directions to the pupils should be given carefully in a tone 
which carries proper emphasis and suggests authority. The 
voice should be just loud enough to be heard in all parts of the 
room used for testing. 

5. Follow the directions of each test strictly, and adhere rigidly 
to the time limits. A stopwatch is highly desirable for timing 
the tests. 

6. See that all pupils start and stop instantly upon the signal. 
Students should be instructed that, should they finish a test 
before time is called, they may go over their work and look 
for mistakes. 

32. Administering Manipulative Tests. 

Manipulative or performance tests present problems of administration 
which differ in certain respects from the objective pencil-and-paper 
tests of information. In a performance test the pupil modifies 
materials with tools, makes a drawing with instruments, or applies a 
coating to a surface. The performance test is a recognition test in 
which the pupil selects the tools or instruments which are already 
available and turns out a product which can be rated objectively and 
compared with other, similar products. 

Like tests of the paper-and-pencil variety, manipulative tests usually 
have prepared sets of directions which must be studied by the 
teacher and carefully followed. It is also advisable for the teacher 
to practice giving a performance test to individual pupils or to a small 
group before attempting to give it to an entire class. Manipulative 
tests require very careful supervision in order that the resulting product 
may be uniform and thus lend itself to objective rating. 

The authors have found the following points helpful in administering 
manipulative tests: 

1. Read aloud and distinctly the directions to pupils while the 
class follows silently. 

2. Answer all questions about the directions before the test is 
begun. 

3. Show the pupils a completed test-product, and, if they care to, 
let them examine it. 

4. If there are no further questions, say, “Get ready. First tool 
or instrument up. Go.” (Record time.) 

5. During the examination answer any questions about the steps 
in the procedure by reading the steps in question with the 
pupil. 

6. Observe the pupils as they work to make certain they are all 
doing the steps in the correct order. 

7. Make certain that the proper tool or instrument is used where 
indicated, but do not tell the pupils how to use the tool. 

8. Help any pupil having trouble in reading working drawings, but 
do not make measurements on his problem for him. 

Other factors which the authors have found important to consider 
in administering a manipulative test are: (1) the condition and placement 
of tools or instruments, (2) quality of materials, (3) lighting, 
and (4) convenience and comfort of place to work. The tools used 
must be of suitable size, properly adjusted, and uniformly sharp for 
each student taking the test. The materials to be used should be of 
uniform quality and free from defects. If one pupil has a piece of 
knotty oak and another one a piece of clear gum-wood, it is obvious 
that the two pupils are not working under comparable conditions and 
have not the same chance to obtain good results. It is conceivable 
that the pupil with the knotty oak might do better work and yet get 
a lower rating. Proper lighting is also very important in a perform¬ 
ance test, because it is necessary for the pupils to get clear images of 
their work. A suitable light for a manipulative test is 12 to 15 foot- 
candles of illumination on the bench top or drafting-board surface. It 
IS also well to be certain that all pupils taking a manipulative test 
have normal eyesight. The benches or drafting boards should be of 
proper height for the individuals taking the test. It Loo frequently 
happens that extremely tall or short pupils are handicapped by too 
low or too high a working surface. 

33. Scoring Manipulative Tests. 

Manipulative tests are scored by physical measurements, rating
scales, and observation of experts. It is obvious from this statement
that the products of manipulative tests cannot be scored as objectively
as the pencil-and-paper tests. However, the scoring is much more objective
than the teacher’s subjective judgment, and usually reasonable
objectivity can be secured. The most objective scores are obtained
where physical measurements can be used. If a pupil cuts a board
13 1/2 inches long when it should have been 14 inches, the board can
be measured and the amount of the error determined in fractions of an
inch. This is as objective as the response to a true-false question.
The rating of a soldered joint, a rope splice, a glued joint, the setting
of a flat-head screw, or the tying of an underwriter’s knot are modifications
of materials, but they are the result of an almost limitless
number of small variables which produce a quality of workmanship
that is not easily measured by instruments or readily judged by experts
without means of comparison. Since this is true, joints, splices,
lettering, etc., can be rated most objectively by comparing them with
samples of known quality. This rating process is referred to as using
a rating scale. The construction and use of rating scales are discussed
in detail in Chapter XII.

There are other characteristics of the results of manipulative tests
which do not lend themselves to physical measurement, but which can
be rated quite objectively by experienced observers. Examples of
these are letters transposed in printing, or loose wires in electrical circuits.
The important thing in achieving objectivity in scoring manipulative
tests is to use the techniques of physical measurement, quality
scales, and experienced observers in rating those results to which they
are adapted. If this is done the objectivity of the scoring of manipulative
tests may be made quite satisfactory.
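
The physical-measurement case in this section can be put in a few lines of Python. This is only an illustrative sketch: the function name `length_error` is hypothetical and belongs to no published scoring key. It works the board example, a cut of thirteen and a half inches against a specified fourteen, in exact fractions of an inch.

```python
from fractions import Fraction

def length_error(measured, specified):
    """Return the absolute error, in inches, as an exact fraction."""
    return abs(Fraction(specified) - Fraction(measured))

# Board cut to 13 1/2 inches when the drawing called for 14 inches.
error = length_error(Fraction(27, 2), 14)
print(error)  # 1/2
```

Keeping the error as a fraction rather than a decimal matches the way a shop rule is read and avoids rounding when errors are later compared between pupils.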

34. Responsibility for Training in the Use of Scales. 

The use of scales for the more or less objective rating of qualities
or products has been pointed out as an important phase of measurement
in industrial education. In fact, many outcomes of shop work
are measurable in no other way than by rating scales. For example,
the quality of a soldered joint in metal work or of a mitered joint in
woodwork is not readily evaluated objectively except through the use
of a scale. Experience with such rating scales makes it apparent,
however, that reliable results cannot be obtained from their use by untrained
and inexperienced judges. Brief courses of training in the use
of the scales result in distinctly reducing the unreliability of measurement
resulting from the subjective factors. Classroom teachers can be
trained in the use of handwriting scales, freehand drawing scales, composition
scales, and doubtless many other kinds of scales, to the point
where the average error or deviation from a known quality rating will
not exceed five points on a hundred-unit scale. This is probably not a
serious inaccuracy in such measurement. It may be inferred therefore
that similar training periods must be provided for industrial education
teachers who are desirous of using rating scales in this field. Increased
reliability of measurement may be expected as a result of such
training.

The head of the department of industrial education in a large
high school may well assume the responsibility for giving his teachers
a brief course of training in the use of rating scales. Typical samples
taken from the chosen field may be used for this practice. Preferably
samples representing a wide range of quality should be chosen. If
the samples are selected from the products of a class and the true
quality scores of the samples are not known, the average ratings given
by a group of six or seven teachers may be taken as the basis for
adjustment. Judges whose ratings deviate most widely from the composite
or the true values should be asked to rerate their samples
making certain conscious adjustments in their mental standards of
quality until they conform quite closely to the standards of the group.
Considerable experience in working with training-groups in the use of
such scales indicates that certain individuals readily adjust their ideas
or standards of quality to those of the scale. There is some slight evidence
that such individuals, being gifted with greater discriminative
power, are usually found among the more able groups of teachers.
For such individuals a brief period of training is adequate. For the
average teacher, inexperienced in the use of such scales, as many as
two or three hourly periods of drill in the rating of the selected specimens
may be necessary before a satisfactory level of accuracy of
measurement is reached.
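
The adjustment procedure described in this section (average the ratings of six or seven judges, then find the judges who deviate most from that composite) can be sketched as follows. The judges, the ratings, and the function names are hypothetical. The five-point limit is borrowed from the figure quoted earlier in the section for trained classroom teachers, and is applied here to deviation from the composite only as an assumption.

```python
from statistics import mean

def composite_ratings(ratings_by_judge):
    """Average the judges' ratings sample by sample."""
    per_sample = zip(*ratings_by_judge.values())
    return [mean(sample) for sample in per_sample]

def mean_deviation(ratings, composite):
    """Average absolute deviation of one judge from the composite."""
    return mean(abs(r - c) for r, c in zip(ratings, composite))

# Hypothetical ratings (hundred-unit scale) of four samples by six judges.
ratings_by_judge = {
    "Judge A": [30, 50, 70, 90],
    "Judge B": [32, 48, 72, 88],
    "Judge C": [28, 52, 68, 92],
    "Judge D": [31, 49, 71, 89],
    "Judge E": [29, 51, 69, 91],
    "Judge F": [42, 62, 82, 96],
}

composite = composite_ratings(ratings_by_judge)
deviations = {
    judge: mean_deviation(ratings, composite)
    for judge, ratings in ratings_by_judge.items()
}

# Judges whose average deviation exceeds five points would rerate their samples.
to_rerate = [judge for judge, dev in deviations.items() if dev > 5]
print(to_rerate)  # ['Judge F']
```

Note that a deviant judge pulls the composite toward his own ratings, so with a small group the composite is itself only an approximation to the true quality values.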

35. Scoring Pencil-and-Paper Tests. 

One of the important distinguishing features which characterizes
objective paper-and-pencil tests is their very objectivity. Objectivity
in a test implies little or no variability in the acceptable
answers. Objective tests should be scored in exact accordance with
the scoring key. The directions should be followed rigorously and
the tests scored exactly according to instructions, even though they
may run counter to the user’s best judgment. Unless this care in
scoring the tests is taken, it is impossible and improper to make comparisons
of the test results with the norms or standards which have
been derived under controlled conditions. Errors in scoring and transcribing
test scores are best eliminated by rechecking all such work
and by performing all related calculations at least twice. Special
care should be taken where the results are to be used for experimental
purposes or for individual pupil analysis.

The remaining phases of the administration of tests in the classroom
are essentially statistical and interpretational in character and
as such are reserved for discussion in Chapters XIV and XV.

SUMMARY 

The matter of determining the responsibility for the giving and 
scoring of educational tests rests to a large degree upon the use to be 
made of the test results. The questions of when to use an educational 
test and what kind of a test to use are answered almost entirely by 
the function the test is to perform. 

Tests in the industrial education field are broadly divided into (1)
tests of information and (2) tests of performance. Information tests
are commonly of the paper-and-pencil variety, calling for evidence of
a mental reaction. Performance tests are manipulative and constructive
in character, calling upon the student to apply tools and skills
to materials, and produce tangible objects of varying quality in accordance
with certain definite specifications. Conditions under which
tests of both types are administered must be carefully controlled if the
results are to be meaningful.


SUMMARY EXERCISES FOR DISCUSSION 

1. What should be the classroom teacher's responsibility for the administration
and interpretation of industrial education tests?

2. What factors determine primarily what tests to give and when to give them?

3. In what specific points does the administration of the objective test of the
paper-and-pencil type differ from that of the manipulative type of industrial
arts test?

4. How may the scoring of manipulative tests be objectified?

5. What does the evidence show regarding the influence of rating scales for shop
products on the reliability of marks assigned? Does training in the use of
such scales appear to pay?


INDUSTRIAL EDUCATION ACHIEVEMENT TESTS

Selected tests for certain fields of industrial education are described
and evaluated in this chapter. In general, the tests named have been
published and widely distributed during the past few years. Many
are quite satisfactory and are in most respects the equal of good tests
in other fields, but quite a number are pioneer efforts and are included
more because they are suggestive for future development along
more scientific lines than for their present merit. Many more carefully
prepared standardized tests in industrial education are needed
before the field will be as well covered as other fields of instruction.
No attempt has been made to include all the available tests, but tests
which seem to have special values for industrial education teachers
have been selected from the different instructional fields.

36. Achievement Tests in Industrial Education. 

In order to measure achievement in industrial education, it is necessary
to measure information and ability to perform tasks involving
the use of tools, machines, and materials. Ability to perform a task
does correlate with knowledge, but the relationship does not seem to
be sufficiently close to warrant the use of the pencil-and-paper-type
test to measure all types of achievement in industrial education. For
example, a pupil may know how to do a job and be able to do it if
given the opportunity, and yet make a poor score on a pencil-and-paper
test because he does not know the technical vocabulary. Another
pupil may know the procedure and the vocabulary but lack the
tool skill necessary for the execution of the project.

The fact that the work in industrial education is not well standardized
from school to school has been pointed out by many test
workers as an insurmountable obstacle in the way of the construction
of standardized industrial education tests, and it has been contended
that further test construction should wait until the work is more definitely
standardized. On the surface this seems logical, but in fact it
has little foundation because, in other fields of instruction, the research
work needed to validate a test has been one of the chief influences
tending to standardize curricular content and establish levels of accomplishment.
In industrial education, as in other subjects, it is
necessary to make a careful study of teaching practice, textbooks,
courses of study, committee reports, and, in many cases, to make extensive
analysis of the subject to be tested. All these studies lead
toward a better understanding of the content in any of the subjects
selected for test construction. These validation studies have a marked
influence on teaching practice because they are put in the form of a
test and the teacher can determine in part whether or not his course
is valid by comparing it with the items in the test and the median
results obtained with those of other schools. It is obvious, therefore,
that standardized tests of achievement with their attending validation
studies are one of the strongest influences tending to define and set
up standards of accomplishment in industrial education courses. It is
of course difficult to establish norms that are of great value, but norms
can be revised as the work becomes more uniform through the use of
validation studies.

37. Standardized Industrial Arts Tests. 

In this section four widely used standardized industrial arts tests 
are briefly described. 

1. Nash-Van Duzee Woodwork Test 1, Scale A ¹

This is a test designed to measure the junior- and senior-high-school
pupil’s knowledge of processes, tools, materials, and information
used in woodworking. Five different types of questions are used in 
the test with appropriate directions for each type. The test is printed 
in a neat, eleven-page booklet and is accompanied by a manual of 
directions, objective scoring key, and class record card. In general 
the test has been carefully constructed. 

Validity. The validity of the test was based on an analysis of 
teaching content as obtained from courses of study, textbooks, surveys, 
reference books, and trade analyses. Apparently the validity of the 
test is satisfactory in the light of common practice. 

Reliability. The coefficient of reliability based on 200 cases was
found to be .86, using the chance-half method and employing the
Spearman-Brown formula for estimation of the reliability of the whole 
test. The reliability of the separate divisions of the test on the same 
number of cases is given in Table 18. Although the reliability of this 
test is a little below the best academic tests of the same type, it is 
very satisfactory for measuring achievement for comparative purposes. 
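
The Spearman-Brown step mentioned above estimates the reliability of a whole test from the correlation between its two halves. The sketch below states the standard formula; the function name is hypothetical, and the half-test correlation of .75 is an assumed value, since the manual's own half-test figure is not quoted here. It is chosen only because it steps up to roughly the reported coefficient.

```python
def spearman_brown(r_part, factor=2):
    """Estimate the reliability of a test lengthened `factor` times.

    r_part is the correlation between comparable parts (for the
    chance-half method, the two halves, so factor = 2).
    """
    return factor * r_part / (1 + (factor - 1) * r_part)

# Assumed correlation between the two chance halves (not from the source).
r_half = 0.75
r_whole = spearman_brown(r_half)
print(round(r_whole, 2))  # 0.86
```

The formula always yields a whole-test figure higher than the half-test correlation, which is why a coefficient obtained this way should be labeled as an estimate rather than an observed correlation.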

¹ Nash, Harry B., and Van Duzee, Roy R., Woodwork Test 1, Scale A, Bruce
Publishing Company, Milwaukee, 1927.


TABLE 18

Reliability Coefficients for Nash-Van Duzee Woodwork Test

Part            Reliability Coefficient
I    A                  .61
I    B                  .80
I    Total              .85
II                      .80
III                     .94
IV                      .88


Norms. Norms are reported on 3000 cases. They are given on the
basis of semesters and number of minutes of instruction from the first
semester of the junior high school through the first two semesters of
high school. A few cases are also reported on training-school students.
The norms are likewise given in the form of percentiles with
a corresponding marking scale. The methods employed in securing
and reporting these norms should be very suggestive to other test
workers in industrial education.

2. Nash-Van Duzee Woodwork Test 1, Scale B ²

This is a test for the purpose of measuring the pupil’s skill in manipulating
hand woodworking tools. It is a companion test of Test 1,
Scale A, which is a test of information rather than performance. The
test is suitable for measuring manipulative achievement in junior- and
senior-high-school hand woodwork. Nash and Van Duzee ³ state in
the manual that “the test aims to measure the pupil’s understanding
of directions involving frequently used woodworking processes and
procedures, the reading of a working drawing, the selection of proper
tools to carry out the specified work and the ability to use the tools
selected to do the required work.”

Validity. The skills for the test were selected after analyzing 
courses of study, textbooks, problem books, and blueprints used in 
junior and senior high schools. The items selected are representative 
of general practice, but there is a question as to whether enough of 
each type of item is given to really sample the pupil’s ability. 

² Nash, Harry B., and Van Duzee, Roy R., Woodwork Test 1, Scale B, Bruce
Publishing Company, Milwaukee, Wisconsin, 1928.

³ Nash, Harry B., and Van Duzee, Roy R., Manual of Directions, Industrial
Arts Test 1, Scale B, Bruce Publishing Company, Milwaukee, Wisconsin.

Reliability. The reliability of this test is reported as varying from
.60 to .80 with an average of about .73. This is higher than a
teacher’s subjective judgment but is probably too low for a first-class
standardized test.

Norms. Median and percentile norms based on a few hundred 
cases are available. 


3. Newkirk-Stoddard Home Mechanics Test ⁴

The chief purpose of the Newkirk-Stoddard Home Mechanics Test
is to measure in an objective and analytical manner the essential 
knowledge that the pupils should acquire from a well-organized course 
in home mechanics. The test is divided into two closely equivalent 
forms, A and B. Each form contains 36 jobs, comprising a test of 
half the outstanding jobs in home mechanics. It is divided into 
Forms A and B so that it will be easier to administer and will more 
nearly fit the various needs of home mechanics teachers. 

Validity. Four criteria were used to establish the validity of the 
test:

1. Surveys to check the jobs on the basis of social utility.

2. Surveys of the actual teaching content of 75 representative
schools. 

3. Analysis of course of study and widely used commercial job 
sheets which were based on surveys. 

4. Selection of jobs with procedures representative of a class of 
jobs rather than just a single job. 


Scoring. The test is objective in its scoring. Each form has a 
printed key in which the correct responses for Part I are placed around 
the margins and the correct diagrams for Part II are reproduced on
the back. This scoring key contains full directions for its use. The 
key sheet requires no cutting and is easy to use. No corrections for 
chance are required. 

In Tables 19 and 20 are given statistical measures which indicate 
the consistency of measurement by the test. 

⁴ Newkirk, Louis V., and Stoddard, George D., Newkirk-Stoddard Home
Mechanics Test, Bureau of Educational Research and Service, University of
Iowa, Iowa City, 1928.


TABLE 19

Reliability of a Single Form (A or B), 40 Minutes' Testing

         No. in          r           Standard Deviation     P.E. Score
Grade    Sample      Job    Point      Job     Point        Job    Point
7        [?]         .49    [?]        2.5     26           1.2    6.6
8        [?]         .64    [?]        3.0     29           1.2    6.5
9        [?]         .59    [?]        3.2     24           1.4    6.3
7-9      [?]         .54    [?]        2.9     27           1.3    6.[?]

TABLE 20

Reliability of Both Forms (A + B), 80 Minutes’ Testing

         No. in          r           Standard Deviation     P.E. Score
Grade    Sample      Job    Point      Job     Point        Job    Point
7        50          .66    .92        4.5     60           1.6    9.5
8        50          .78    .94        5.5     60           1.5    9.3
9        50          .75    .92        5.7     45           1.9    8.6
7-9      150         .70    .93        5.4     53           2.0    9.4

Norms. Table 21 recapitulates preliminary norms obtained from
grades 7, 8, and 9.


TABLE 21

Norms, Forms A and B, May Testing (N = 390)

                            Jobs                 Points
                       Form A   Form B      Form A   Form B
Mean                    6.2      5.6         74.7     74.6
Median                  4.8      5.2         77.9     73.0
Upper quartile          7.0      7.6         95.8     96.3
Lower quartile          [?]      3.1         60.5     54.6
Standard deviation      [?]      3.4         25.1     27.9


4. Wells-Laubach Industrial Arts Tests ⁵

Tests in woodwork, printing, machine shop, and mechanical drawing
are included in this series. All the tests are of the pencil-and-paper
type and with the exception of one in mechanical drawing are
made up of 100 true-false statements. Twenty-five minutes is the
working time for the woodwork and printing tests, 20 minutes for the
machine shop, and 30 minutes for mechanical drawing.

Validity. The content of the tests parallels teaching practice in
a general way but no scientific means of determining validity is reported.

Reliability and Norms. No statistical data are given on the tests, 
but tentative norms based on the median accomplishment in about 
1000 cases are reported. The tentative norms are given on the basis of 
four semesters for each of the four tests. 

38. Non-Standardized Industrial Arts Tests.

1. Hunter Shop Tests, Series 1 and 2 ⁶

Hunter has developed 32 short objective tests. Each test includes
25 objective questions on the particular subject measured. The following
is a list of these objective tests according to the subjects or
parts of subjects tested. They are of the pencil-and-paper type.

Woodwork
W-1 Tools test
W-2 Fastenings test
W-3 Comprehension test
W-4 Trade names test
W-5 Reading test for rule or scale
W-6 True-false test
W-7 Completion test
W-8 Building parts test for carpentry
W-9 Board measure test
W-10 Multiple-choice test for wood and lumber
W-11 Reading test for framing square
W-12 Objective test for wood finishing
W-13 Multiple-choice test for carpentry
W-14 Multiple-choice test for pattern makers

Mechanical Drawing
MD-1 Reading test
MD-2 Missing-line test
MD-3 Lettering test
MD-4 True-false test

Machine Shop
MS-1 Tool test
MS-2 Comprehension test
MS-3 True-false test
MS-4 Micrometer reading test
MS-5 Multiple-choice test

Electric Shop
E-1 Symbols test
E-2 Objective test

Automobile Mechanics
AM-1 Parts test
AM-2 Multiple-choice test

Printing
P-1 Completion test

Related Subjects
G-1 Shop English
G-2 Shop mathematics
G-3 Shop arithmetic
G-4 Geometry

⁵ The Manual Arts Press, Peoria, Illinois, 1928.

⁶ The Manual Arts Press, Peoria, Illinois, 1927.

Validity. The validity has been determined in a general way from
teaching experience, pooled judgment, and analysis of textbooks and
courses of study.

Reliability and Norms. The reliability is not reported, and the
tests are not standardized. They are too short to have very high
reliability individually, but in using a battery of the woodworking
tests the authors have found coefficients of reliability as high as .85.

39. Mechanical Drawing Tests. 

Four tests in mechanical drawing are described in this section. 

1. Badger Standard Test in Fundamental Mechanical
Drawing, Tests 1, 2, 3 ⁷

The author states that “these are tests of what the pupil knows
about the phases of drawing covered rather than a test of his drawing
ability measured in terms of neatness, accuracy, lettering and so forth.”
The test includes 145 exercises of the multiple-choice type. The form
has been varied to fit the types of content tested. There are three
tests. The first deals with knowledge relating to the use of instruments,
line work, dimensioning, and lettering; the second tests knowledge
of projection and includes sections and auxiliary views; and the
third measures knowledge of pictorial drawing, isometric, cabinet, and
oblique. The test does not have time limits, but the directions suggest
that the tests be collected after all but two or three of the slowest
pupils have finished. The validity and reliability of the test are
not given in detail.

⁷ Public School Publishing Company, Bloomington, Illinois, 1929.

2. Castle Mechanical Drawing Test ⁸

This test is divided into five subtests. Subtest 1 requires the pupil
to identify similar parts of an object in top and side views by matching
corresponding numbers and letters. Subtest 2 deals with dimensions;
3, with geometric terms; 4, with pencil technique; 5, with inking.
The working time for the test is 41 minutes. The first three
parts are objective in scoring, but the last two depend to some extent
on the teacher's subjective judgment, although the scorer is provided
with letter rating scales and six points are mentioned for rating the
drawing.

Validity. No very definite statement is given as to validity, but 
it is based on analysis of instructional materials and a long teaching 
experience in mechanical drawing. 

Reliability and Norms. The coefficient of reliability is not reported,
and norms have not been established for the test. It is not
standardized but should prove useful for measurement of drawing
achievement in the same manner that a teacher-made objective test
would be used.

3. Fischer Mechanical Drawing Tests, Parts I and II ⁹

Part I of this test covers the technical information necessary in 
drawing. No instruments other than a pencil are needed. Part II is 
a performance test and requires the use of drawing instruments. 
Either test can be given in a 45-minute drawing period. Part I is 
composed of four subtests and Part II of three subtests. Both parts 
of the test should be given since it is desirable to test information 
and performance. Parts I and II are not equivalent forms but are 
divisions of the same test. The test has considerable diagnostic value 
as it enables the teacher to see where the pupils have succeeded and 
where they have failed. The problems in the test are rated according 
to difficulty. The test includes a manual, scoring key, and a class
record sheet. 

⁸ The Manual Arts Press, Peoria, Illinois, 1928.

⁹ The Bruce Publishing Company, Milwaukee, Wisconsin, 1929.

Validity. The claim for validity is based on analysis of textbooks,
blueprints, courses of study, and in addition on a survey of 100 schools
to find out what was being taught, time being devoted to drawing,
etc. This material was tabulated under five major divisions as follows:
use of instruments, lettering, projection drawing, geometric constructions,
and pictorial representations. The content included in the
test was carefully validated on the basis of teaching practice and represents
good workmanship.

Reliability. The coefficient of reliability was determined by giving
the same test twice to 150 sophomores in high school. This resulted in
a correlation coefficient of .79. This is quite low for a standardized
test.
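
A test-retest coefficient of this kind is the ordinary product-moment correlation between the scores from the two administrations. The sketch below shows the computation; the scores are invented for illustration and are not taken from the Fischer data, and the function name is hypothetical.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between paired score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores of six pupils on two administrations of one test.
first = [42, 55, 61, 48, 70, 66]
second = [45, 52, 65, 50, 68, 71]

r = pearson_r(first, second)
print(round(r, 2))
```

Because the same test is given twice, memory of the items and practice between sittings can affect a coefficient obtained this way, which is one reason it is not directly comparable with a chance-half or equivalent-form figure.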

Norms. Median-score graphs are given which indicate the median
score for all schools as represented by scores from 2500 students. The
norms or medians of accomplishment are classified on the basis of
minutes of instruction. The author also suggests means of using bar
diagrams and the use of test scores as a partial means of assigning
marks.


4. Nash-Van Duzee Industrial Arts Test, Test II,
Mechanical Drawing ¹⁰

The test is designed to measure objectively performance in drawing
as well as information about mechanical drawing. The test is
suitable for use in both the junior and senior high school. The test 
is divided into Part I and Part II and is available in two closely 
equivalent forms. Forms I and II were equalized on the basis of 
the results obtained from 500 mechanical drawing pupils in the ninth 
and tenth grades. A manual of directions, objective scoring key, and 
a class record sheet are provided. The scoring key also includes a 
scale for the rating of ability to letter. Part I of either form can be 
written in the ordinary classroom with a pencil, but Part II requires 
the use of mechanical drawing instruments. 

Validity and Construction. The claims for validity are based on 
the analysis of textbooks, courses of study and reference books, and a 
rating of the analysis by several hundred persons interested in teaching
mechanical drawing.

Reliability. The reliability of the test was found to be .87 with
203 fifth- and sixth-semester pupils. Statistically, this is quite a satisfactory
reliability since it was determined by correlating Form I
with Form II.

¹⁰ The Bruce Publishing Company, Milwaukee, Wisconsin.

Norms. Median and percentile norms indicating accomplishment
by semesters and minutes based on 2500 cases are given for the junior
high school and the first two years of high school. A suggestive scale
for converting percentile norms into equivalent class marks is given
in the table of percentile norms. Percentile curves for Forms I and II
are used to show the approximate equivalence of the two forms of the
test.
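
Percentile norms of the kind described place a pupil's score by the percentage of the norm group falling below it. A minimal sketch follows, using one common convention (cases below the score plus half of those tied with it). The norm scores and the function name are hypothetical, and the test's own scale for converting percentiles to class marks is not reproduced here, so none is assumed.

```python
def percentile_rank(score, norm_scores):
    """Percent of the norm group below `score`, counting ties as half."""
    below = sum(s < score for s in norm_scores)
    equal = sum(s == score for s in norm_scores)
    return 100 * (below + 0.5 * equal) / len(norm_scores)

# Hypothetical norm group of ten scores.
norm_scores = [31, 38, 44, 47, 52, 52, 58, 63, 69, 75]
print(percentile_rank(52, norm_scores))  # 50.0
```

With norms classified by semesters and minutes of instruction, the same raw score would be looked up against a different norm group for each level.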

40. Trade Tests. 

Trade tests are of value to industrial education teachers who teach 
vocational courses, and they are very suggestive to teachers of the 
general educational courses of the junior high school. Trade tests 
measure trade proficiency; they are valuable in selecting men who 
possess the information and skill necessary to succeed in a given trade 
and for measuring accomplishment in advanced vocational courses. 

Chapman ¹¹ has pointed out the significant distinctions between
intelligence tests and trade tests. “While these two forms of the test,
the mental test and the skill prediction test, both have a great sphere
of usefulness in industry, it is very essential to precise thinking on
the subject of industrial testing not to confuse these with the trade
test proper. The trade test makes no pretense of measuring intelligence
directly; it makes no attempt to measure the native endowment
of the subject, with a view to predicting the degree of success
to be expected as a result of training in a specific trade; the trade
test furnishes a rating, in objective quantitative terms, of the degree
of trade ability already possessed as a result of practice in the trade.”

Trade tests present numerous difficulties in their construction and 
for that reason have not been entirely successful, although the better 
tests are decidedly superior to subjective judgments in selecting qualified
tradesmen. One difficulty has been the lack of information of
test workers about abilities, techniques, skills, and attitudes necessary 
for success in a given occupation to develop valid measures. Another 
difficulty is that trade tests are not always given under trade conditions,
with the result that a man may succeed on the test but fail on
the job. Trade tests are also expensive of time and money. Many 
of them are individual tests and require material and tools for the 
measurement of manipulative skills. 

Trade tests of four general types (oral, picture, performance, and
written group tests) were widely used in the army during the World
War to select men who were proficient in the various trades. This
procedure saved considerable time and money. Since the war many
industries have employed trade tests in selecting applicants for positions.
Tests of this type are used in vocational guidance. Trade
tests have been greatly improved and modified during the past ten
years and have been adapted to the needs of industry.

¹¹ Chapman, J. C., Trade Tests, Henry Holt and Company, New York,
Chapter XI, p. 374, 1921.


SUMMARY 

Objective tests have appeared somewhat more slowly in industrial 
education than in certain other branches of instruction. Possibly this 
has been because of a lack of definiteness in the statements of the 
objectives of certain of the industrial courses. 

Achievement in the industrial subjects is not entirely a matter of
information. Ability to perform a task does correlate with knowledge
about the task, but this relationship does not seem to be sufficiently
high to warrant the exclusive use of pencil-and-paper tests for
the measurement of achievement in the industrial subjects. Accordingly,
performance as well as informational types of tests are needed
in this field.

SUMMARY EXERCISES FOR DISCUSSION 

1. Discuss the special limitations of paper-and-pencil tests in the industrial
subjects.

2. Show how the fact that industrial education courses are not well standardized
from school to school accounts for numerous difficulties in the construction
of tests.

3. Select at least one test in the fields of woodworking, mechanical drawing, and
shop work, and present the major values and limitations of each.


CHAPTER VIII

INTELLIGENCE AND APTITUDE TESTS IN INDUSTRIAL EDUCATION

1. MEASUREMENT OF INTELLIGENCE 

41. Meaning of Intelligence. 

The exact nature of intelligence is not well understood, but it is
definitely known that individuals vary quite widely in mental ability
and that within limits it can be measured. Authorities in the field of
mental measurement are far from agreement as to what the term intelligence
implies. Some consider that intelligence is best indicated
by the ability of the individual to solve problems, to adapt himself
to new situations. Others hold that the abilities to perceive with speed
and accuracy, to associate symbols, to manipulate abstract concepts,
and to reason, are the best evidences of intelligence. Facility in the
use of language itself is considered by some to be one of the very significant
evidences of intelligence. For the purposes of this discussion
intelligence will be considered as the capacity for learning, plus the
informations, skills, and attitudes which the individual has gained
from reacting to his environment. This rather liberal conception of
intelligence permits it to fit readily into its place in the educational
program and also places in an acceptable light the majority of devices
for the measurement of general mental ability.

42. Measurement of General Mental Ability. 

Teachers and educators in general are aware that at least two
related but different phases of intelligence must be taken into account
in adequate mental measurement. Certain individuals react readily
to abstract stimuli and thus are frequently rated as normal or even
superior on the basis of mental tests in which abstract material
predominates. Other types of individuals do not respond to abstractions
but reveal unusual aptness in reacting to concrete and tangible
material. Stenquist¹ has stated the case for this type of pupil most
convincingly, and has presented a very useful supplement to the
¹ Stenquist, John L., "A Case for the Low I.Q.," Journal of Educational
Research, Vol. 4: 241-54, November, 1921.

ordinary abstract type of mental measurement in his Mechanical
Aptitude Tests.

It is true that much remains to be learned about intelligence and
its measurement. There are those who argue that measures of
intelligence should not be used because its exact nature is not known. This
argument is no more valid than the statement that electricity should
not be measured or used because its exact nature is not known.
Intelligence, like electricity, can be measured to the distinct advantage
of society if the results are properly used. Unquestionably the scores
from mental tests do not reveal intelligence as exactly as the dials of
an electric meter indicate the number of watts of electricity consumed.
Nevertheless, a few reasonably reliable and valid measures of
intelligence are available for general use. Odell² states that at least
two hundred tests of mental ability have been constructed since the
early work of Binet, and that approximately one hundred are still
available for use.

43. Methods of Measuring Intelligence. 

Intelligence tests are of two general types, individual mental
examinations and group tests of mental ability. Individual mental
examinations are thought to be considerably more valid, and because of
the method of administering them they are probably more reliable
than group tests. Individual examinations are expensive since they
are given to only one subject at a time and since they should preferably
be given only by a trained examiner well grounded in psychology.
Much of the significance of the individual examination lies in the
interpretations of the subject's reactions by the examiner as the stimuli
are presented. Group tests are easy to administer, some being almost
self-administering.

The problems of measuring intelligence commonly met by the 
teacher of industrial education can usually be handled satisfactorily 
by the use of carefully selected group tests. However, the group test 
results should almost certainly be supplemented by the individual 
mental examination for those having very high or very low scores and 
for problem cases. Where this is not possible, or where the problem is
not extremely serious, the use of two or even three group mental tests
is to be recommended. The average of the mental-age scores obtained
from two or three group tests is a much more accurate measure than 
that ordinarily obtained from a single testing. 

² Odell, C. W., Educational Measurements in High School, Chapter XV, pp.
391. The Century Company, New York, 1930.

Since the problems of the industrial education teacher will ordinarily
be solved by the use of a good group test, and since most classroom
teachers are not adequately trained or experienced in the use of
the individual examination, a brief list of excellent group tests is
presented for detailed consideration. A short description and evaluation
of the most widely used individual mental examinations is given
in a later section in this chapter.

44. Group Tests of Mental Ability. 

The four group tests of mental ability selected for description and 
evaluation here are chosen from an extensive list of such tests. These
tests are suitable for use in grades VII to XII, inclusive. Each test 
has been carefully validated and ranks high among such tests for 
the reliability of the test forms themselves as well as for the reliability 
of the age norms used as the basis of interpretation. These tests can 
be readily administered to a group of any number of pupils. The 
results are comparable within reasonable limits to those obtained on 
the individual examinations. 

1. Kuhlmann-Anderson Intelligence Tests³

The Kuhlmann-Anderson Intelligence Tests are the result of more 
than ten years of careful research by both authors working in the 
Research Division of the Minnesota State Board of Control. The 
thirty-nine tests comprising the battery are arranged in a scale of 
overlapping units, the net results of which closely approximate the 
results from any good individual mental examination. The tests in 
their most recently revised form are arranged in the booklets as shown 
in Table 22. 

TABLE 22

Arrangement of Kuhlmann-Anderson Tests

School Grade            Tests      Age When Test Fits Best
I, First semester       1-10       6-0
I, Second semester      4-13       6-6
II                      8-17       7-6
III                     12-21      8-6
IV                      15-24      9-6
V                       18-27      10-6
VI                      22-31      11-6
VII-VIII                25-34      13
IX-XII and adult        30-39      15-6


³ Kuhlmann, F., and Anderson, Rose, Kuhlmann-Anderson Intelligence Tests,
Educational Test Bureau, Minneapolis, Minnesota, 1927.
A somewhat novel procedure is used in interpreting the results of
these tests. Each of the ten tests comprising a booklet is standardized
separately. The test is scored in terms of the number of exercises
answered correctly. By referring to a table of norms the mental age
of the individuals making such a score is obtained. A mental-age
score for each of the ten tests is thus obtained, the final mental-age
score assigned to the pupil being the median of the resulting mental
ages. This procedure appears to result in unusually reliable
measurement of mental ability.⁴
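
The arithmetic of the median procedure may be illustrated by a brief sketch in Python. It is not taken from the test manual; the function name and the ten sample mental ages (expressed in months) are hypothetical and are given only to show how the final score is reached.

```python
from statistics import median


def median_mental_age(mental_ages_months):
    """Return the median of the mental ages (in months) obtained on the
    separate tests of one booklet."""
    if not mental_ages_months:
        raise ValueError("at least one mental age is required")
    return median(mental_ages_months)


# Hypothetical mental ages, in months, for the ten tests of one booklet.
ages = [150, 156, 144, 162, 158, 149, 153, 160, 147, 155]

ma = median_mental_age(ages)
years, months = divmod(round(ma), 12)
print(f"Median mental age: {years} years {months} months")
```

Because the median is used, one unusually high or low result among the ten tests has little effect on the final mental-age score.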

2. Otis Group Intelligence Scale⁵

Advanced Examination, Forms A, B 

This was one of the first tests to appear for measuring intelligence 
at the secondary-school level. It has been widely used in Grades VII 
to XII inclusive. It is composed of ten divisions as follows: following 
directions, opposites, disarranged sentences, proverbs, arithmetic, 
geometric figures, analogies, similarities, narrative completion, and 
memory. The test requires more than an hour to give; it has 230 
test elements with an actual working time of about 45 minutes. The 
coefficient of reliability for grades and half grades is around .84 and
around .97 for all grades combined. The test correlates approximately
.75 with a suitable criterion.

3. Self-Administering Test of Mental Ability⁶

Higher Examination, Forms A, B, C

This test is unique in that it requires a minimum amount of
instruction from the examiner. For this reason industrial education
teachers will find this a very satisfactory test to use. The test has two
time limits of 20 and 30 minutes. Generally the 30-minute time is
satisfactory except possibly in the last years of the senior high school.
The reliability of the test is reported as .92, and it has a high
correlation with a valid criterion.

⁴ Kuhlmann, F., "A Median Mental Age Method of Weighting and Scaling
Mental Tests," Journal of Applied Psychology, June, 1927.

Pintner and Paterson, A Scale of Performance Tests, 1917.

⁵ Otis, A. S., Group Intelligence Scale, World Book Company,
Yonkers-on-Hudson, New York, 1918.

⁶ Otis, A. S., Self-Administering Test of Mental Ability, World Book
Company, Yonkers-on-Hudson, New York, 1922.
4. Terman's Group Test of Mental Ability⁷

Forms A and B 

The Terman test is composed of ten divisions as follows: information,
best-answer, word meaning, logical selection, arithmetic, sentence
meaning, analogies, mixed sentences, classification, and number
series. Two approximately equal and interchangeable forms are available.
There are 185 items in each form of the test. It can be given in
40 minutes, although the actual working time is only 27 minutes.
The reliability of the test is approximately .90. It correlates .75
with a suitable criterion of mentality. Complete tables of mental-age
norms are given in the examiner's manual.

45. Individual Mental Examinations. 

Individual mental examinations probably constitute the most accurate
devices for the measurement of intelligence. The length of the
test, the wide variety of reactions called for, the fact that the subject
receives his instructions personally from the examiner, and the fact
that this affords the examiner an opportunity to observe each reaction
of the subject all combine to account for this greater accuracy of
measurement. However, this greater accuracy is offset by the
greater expense of administering the tests, in terms of both time and
money. In fact, it is thought by many that
this expense item is so great that in most classroom and shop situations
the resulting increase in accuracy of measurement is not commensurate.
Accordingly, it is quite probable that teachers of industrial
education will find it desirable to initiate their analysis of
problem cases by first using good group tests of mental ability.
Individual mental examinations may be given later to a relatively small
number of pupils who deviate most widely from the normal.

⁷ Terman, L. M., Group Test of Mental Ability, World Book Company,
Yonkers, New York, 1920.

A very simple procedure will reveal directly to the teacher the
special individuals who should receive further attention. If the group
mental-test scores for the entire class are tabulated in a frequency
table, or if the test papers themselves are merely arranged in descending
order of size of the test scores, the individual pupils deviating most
widely from the average for the class and from the normal mental age
for the grade will be revealed. Thus, it may be necessary to retest
(or to give the individual mental examination to) only a small
percentage of the group. After all, it is in the highest and lowest
extremes of intelligence that the problem cases arise, and it is also in this
same group that the most serious errors or misinterpretations are
likely to take place. Most present-day tests of mental ability accomplish
a reasonably accurate placement of the more nearly normal
group.
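
The ranking procedure just described lends itself to a short sketch in Python. The pupil labels and scores are hypothetical, and the cutoff of 1.5 standard deviations from the class average is an illustrative assumption only; the text asks merely that the pupils deviating most widely be singled out.

```python
from statistics import mean, pstdev


def flag_for_retest(scores, cutoff_sd=1.5):
    """Return (pupil, score) pairs, highest score first, for pupils whose
    score lies more than cutoff_sd standard deviations from the class mean.
    The default cutoff is an assumption made for illustration."""
    values = list(scores.values())
    avg = mean(values)
    sd = pstdev(values)
    # Arrange the papers in descending order of score, as the text suggests.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if sd == 0:
        return []  # all scores identical; nobody deviates
    return [(pupil, s) for pupil, s in ranked if abs(s - avg) > cutoff_sd * sd]


# Hypothetical group-test scores for a small shop class.
class_scores = {"A": 112, "B": 98, "C": 101, "D": 64, "E": 105,
                "F": 97, "G": 143, "H": 100, "I": 103, "J": 99}

print(flag_for_retest(class_scores))
```

Only the pupils returned by such a check would need a second group test or an individual examination; a stricter or looser cutoff may be chosen to suit the size of the class.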

Stanford Revision of Binet-Simon Intelligence Tests⁸

This extensive mental scale includes groups of tests suitable for
the measurement of mental ability from three years to fourteen
inclusive as well as tests suitable for average adults and superior adults.
There is a complete manual of directions, stating in detail just how to
administer and score the test. The reliability of the test is around
.95, and its validity is commonly considered as a standard in
constructing other group and individual tests of intelligence. The validity
of the test has been frequently criticized because it is composed of so
much verbal material, but this same criticism can be applied to many
intelligence tests. This is one of the most widely used of the individual
tests of intelligence for classroom use.

46. Results of Mental Measurement. 

Intelligence-test scores should be interpreted with great care. After
all, such scores are only estimates of intelligence and should not be
considered absolute measures. The individual is very complex, and
many factors may affect the rating of the pupil. In the first place,
the differences in environment and in training opportunities are
frequently overlooked in interpreting mental-test scores. The intelligence
of different individuals may legitimately be compared only when there
is assurance that the learning opportunities have been the same. This
fact, incidentally, is difficult to establish. Accordingly, few such
comparisons are legitimate. Numerous other factors such as the
individual's inability to see, to read, to hear, or some other temporary
physical disability may seriously influence the score. Errors in the
administration of the tests, errors in scoring, and clerical errors in
transcribing and in computing results must be guarded against at all
times. The shop teacher should be especially critical of very high
and very low scores since they are the ones most likely to be in
error. Every very low and every very high score should be carefully

⁸ Terman, L. M., et al., The Stanford Revision and Extension of the
Binet-Simon Scale for Measuring Intelligence, Warwick and York, Baltimore, 1917.
Test material also through Houghton Mifflin Company, Boston, and C. H.
Stoelting Company, Chicago.
rechecked by other group tests or by individual examinations before
any serious administrative or instructional adjustments are made. 

47. Mental-Age Score. 

The results of mental testing which are of most use to the classroom
teacher are the mental-age scores. These scores are derived
from the raw test scores and afford the basis for the calculation of a
number of useful derived scores called quotients. Mental-age norms
for an intelligence test are commonly secured by administering the
mental examination to large numbers of individuals of various age
levels. After the tests have been scored the papers are usually
assembled by age groups, all the nine-year-olds being placed in one
group, all the ten-year-olds in another, etc. In this way the typical
scores to be expected of individuals of different chronological ages
may be determined. If the typical point score of a large group of
ten-year-olds should be 126 on a given mental test, thereafter any
individual making a score of 126 points would be designated as having
a mental age of ten years. Certain of the more carefully validated
and standardized group tests have established their mental-age norms
on the basis of results from large numbers of individual mental
examinations.
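
The derivation of mental-age norms may be sketched in Python as follows. The sample scores are hypothetical, the median is taken here as the "typical" score of each age group, and the straight-line interpolation between adjacent age norms is an assumption added for illustration; a published test would supply its own table of norms. The sketch also assumes that the typical score rises from one age group to the next.

```python
from bisect import bisect_left
from statistics import median


def build_norms(scores_by_age):
    """Return (typical score, age in years) pairs, sorted by score, using
    the median of each age group as its typical score."""
    return sorted((median(s), age) for age, s in scores_by_age.items())


def mental_age(score, norms):
    """Estimate the mental age for a point score by interpolating between
    the two nearest age norms (an illustrative assumption)."""
    points = [p for p, _ in norms]
    if score <= points[0]:
        return float(norms[0][1])
    if score >= points[-1]:
        return float(norms[-1][1])
    i = bisect_left(points, score)
    (p0, a0), (p1, a1) = norms[i - 1], norms[i]
    return a0 + (a1 - a0) * (score - p0) / (p1 - p0)


# Hypothetical point scores, grouped by chronological age in years.
samples = {
    9: [100, 110, 114, 118, 125],
    10: [115, 120, 126, 131, 140],
    11: [128, 134, 138, 143, 150],
}
norms = build_norms(samples)

print(mental_age(126, norms))  # the typical score of the ten-year-olds
print(mental_age(132, norms))  # a score midway between two age norms
```

With these figures a score of 126 is assigned a mental age of ten years, in keeping with the example in the text.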


48. Intelligence Quotients. 

Mental-age scores make possible the derivation of a series of
quotients which are very useful in the interpretation of mental- and
achievement-test results. The most commonly used quotient of this
type is the intelligence quotient or I.Q. The I.Q. is the ratio of the
mental age to the chronological age of the individual tested. The
formula for the I.Q. is $\text{I.Q.} = \dfrac{\text{M.A.}}{\text{C.A.}}$, in which the M.A. is the mental
age and the C.A. is the chronological age, both expressed in months.
The I.Q. itself expresses the relative mental development of the
individual. If the pupil makes a score on the mental test which gives
him a mental age identical with his life age his resulting I.Q. is 1.00,
or 100 as it is usually expressed. A pupil is commonly considered
normal if his I.Q. falls between 90 and 110. Intelligence quotients of
110 to 120 are above average, and above 120 are superior. Quotients
above 130 approach the genius class, and quotients of 140 to 150
indicate most unusually accelerated mental development. Similarly,
quotients of 80 to 90 are below average. Individuals with quotients of
less than 80 are poor and may be expected to encounter much difficulty
in the mastery of abstract material at the junior- and
senior-high-school level. Quotients of 70 to 80 are very low, and quotients
below 60 indicate exceedingly retarded mental development bordering
on the moron level and idiocy.
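
The computation of the I.Q. may be shown in a few lines of Python. The function names are hypothetical, the quotient is multiplied by 100 in the customary manner, and the treatment of the boundary values (for example, counting exactly 110 as normal) is an assumption, since the ranges quoted above overlap at their end points.

```python
def intelligence_quotient(mental_age_months, chronological_age_months):
    """Return the I.Q. as the ratio M.A. / C.A., expressed on the usual
    scale in which a ratio of 1.00 is written as 100."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return 100 * mental_age_months / chronological_age_months


def describe(iq):
    """Return the descriptive range for an I.Q., following the broad
    ranges quoted in the text; boundary handling is an assumption."""
    if iq > 120:
        return "superior"
    if iq > 110:
        return "above average"
    if iq >= 90:
        return "normal"
    if iq >= 80:
        return "below average"
    return "below 80"


# A pupil aged 12 years 0 months with a mental age of 12 years 6 months.
iq = intelligence_quotient(150, 144)
print(round(iq, 1), describe(iq))
```

Both ages must be expressed in the same unit, months being the usual choice, before the ratio is taken.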

In the interpretation of the quotients derived from group-mental
test scores it should be remembered that such quotients represent
individual interpretations, whereas, as a matter of fact, the tests on
which they are based are group tests. In general, intelligence quotients
based on group-test results should not be utilized for any serious
purpose on an individual basis.

Intelligence-test scores are generally regarded as professional
information to be used in teaching and guidance, but not to be given to
the pupil or his parents. Long experience with intelligence tests has
proved this to be a wise policy. The scores from such tests are
suggestive to the teacher and should be used only as indicative of
capacity. When these ratings are given to the layman he is likely to
look upon them as final rather than suggestive and fail to interpret
them in the light of a professional background. There may be rare
occasions when it is feasible to give this information to a pupil or
his parents provided it will aid the pupil or parent in a better
understanding of the pupil's possibilities of accomplishment or his future
needs. It is possible sometimes to encourage a pupil who is doing poor
work and is discouraged by pointing out to him that he has average
native ability and can succeed with proper application. Occasionally
it may be well to point out to a dull pupil that he is doing well in the
light of his ability. It may be that a lazy, bright pupil can be
motivated by pointing out his failure to capitalize his real ability.
Occasionally, parents who punish their children for low marks can be
made a little more lenient by showing them that their children are
doing well for their ability. These suggestions must be used with
extreme tact and care or they will prove destructive rather than
constructive.

The shop teacher must be careful to distinguish clearly between
intelligence-test scores and tests of achievement. The intelligence test
gives an indication of the student's capacity for acquiring information
largely through the use of abstract processes. The achievement test
aids in the measurement of actual accomplishment in the class and
can be used as a partial basis for assigning shop marks. The
intelligence-test score is of value in teaching and guidance; it should
not be used directly in marking achievement inasmuch as intelligence
tests are not measures of achievement in specific courses.
II. MEASUREMENT OF SPECIAL APTITUDES

49. Measurement of Special Abilities. 

The measurement of general mental ability suggests the possibility
of securing objective evidence of special types of abilities or aptitudes.
This is a field of measurement in which all teachers in the secondary
schools should be interested, for adequate educational guidance is
becoming more and more important at these levels in our educational
program. Educational and vocational misfits, the high pupil mortality
in certain of our courses, the heavy teacher-burden caused by
increasingly large classes, as well as the general embarrassment to the school
resulting from the misapplication of abilities, all demand that more
attention be given to this phase of educational measurement.
Industrial education teachers, because of their proximity to the problems,
represent a group which should be greatly interested.

50. Measuring Mechanical Aptitude. 

An aptitude may be thought of as the capacity of an individual
for the development of some special ability or skill. Mechanical
aptitude is the capacity of an individual to deal successfully with
mechanical devices, and to acquire the knowledge essential to their
selection and operation after suitable training has been given. An
individual who has a large measure of mechanical aptitude, other things
being equal, will readily succeed when given instruction. On the other
hand, an individual with low mechanical ability is likely to fail
regardless of the instruction or opportunities given to work with
mechanical things.

The importance of identifying mechanical aptitudes is more obvious
when it is realized that at least 40 per cent of the gainfully employed
population in the United States is dependent in some measure
for its economic success on the possession of mechanical ability. It
is true, of course, that mechanical ability is only one factor in success
even in mechanical pursuits, but it is also true that it is quite an
important factor. The industrial education teacher should keep this
clearly in mind and should blend other important guidance information
with the evidence of the pupil's mechanical ability.

It thus becomes apparent that a knowledge of the student's mechanical
ability is important to industrial education teachers from
both the guidance and the instructional points of view. Knowledge of
the fact that an individual has low or high mechanical aptitude gives
the industrial education teacher an objective basis for guiding the
pupils into or out of vocations which involve high degrees of these
abilities. It enables the teacher to assign projects better adapted to
the individual differences of the pupils in the class. Such a knowledge
is of value to trade- or continuation-school teachers in selecting
individuals who are likely to profit by the instruction offered. However,
it is well to bear in mind that mechanical-ability tests must be
carefully administered and interpreted, and that, at best, they are
merely very suggestive and should not be considered infallible.

It is widely known that individuals vary in mechanical ability.
It is also known that mechanical ability does not correlate highly
with intelligence, the coefficient usually being around +.40. Stenquist
pointed this out a number of years ago. This does not mean that
many individuals with high intelligence as measured by intelligence
tests do not have high mechanical ability, nor does it mean that
individuals with low intelligence always have high mechanical ability.
It strongly suggests that there may readily be a concrete aspect of
intelligence which is not necessarily an accompaniment of intelligence of
the abstract type. Paterson, Elliott, et al.⁹ report that there is a
fairly uniform tendency for test scores on mechanical ability to
increase with chronological age between eleven and twenty. The same
authors found little support for the supposition that men excel women
in mechanical ability. The only test in which men and boys clearly
excelled was the Minnesota Assembly Test, and this was probably
due to greater experience with the materials. Judging from the data
available, engineering students are not superior to liberal arts students
in innate mechanical ability. This emphasizes the fact that guidance
is made up of many important factors and even in engineering colleges
mechanical ability is not an infallible guidance factor.

Industrial education teachers can readily see the guidance value of
tests of mechanical ability, and fortunately some good tests are
available for use. The three measures of mechanical ability described and
discussed in the following pages deserve careful study by shop and
drawing teachers.

1. Minnesota Mechanical-Ability Tests 

The Minnesota Mechanical Ability Tests are the outcome of research
at the University of Minnesota during the years 1923-1927.
They are probably the most carefully prepared tests of mechanical 
aptitude that have been published for general use. The tests are quite 

⁹ Paterson, Elliott, Anderson, Toops, Minnesota Mechanical Ability Tests,
University of Minnesota Press, Minneapolis, Minnesota, 1930, pp. 282-284.

Materials for the Minnesota Mechanical Ability Tests may be obtained
from the Marietta Apparatus Company, Psychological Equipment, Marietta,
Ohio.
reliable, and their validity has been carefully checked against objective
criteria. 

When administered according to directions the Minnesota Mechanical
Ability Tests will give results which will be very useful in
teaching and guidance. The battery includes the following six tests:

(a) Minnesota Paper Form Board, Series A and B, for either
    group or individual testing.

(b) Minnesota Spatial Relations Test (individual test).
    Boards A and B.
    Boards C and D.

(c) Minnesota Assembly Test (group or individual).
    Long form: Box A, B, and C.
    Short form: Box 1 and 2.

(d) Minnesota Interest Analysis Test.

(e) Packing Blocks Test.

(f) Card Sorting Test.

The authors report the following coefficients of reliability and 
validity for these tests. 

TABLE 23

Coefficients of Reliability and Validity on Minnesota Mechanical
Ability Tests¹¹

                                                  Reliability    Validity Coefficient
Test                                              Coefficient*   (Between test and
                                                                 quality criterion)
Minnesota Assembly, Boxes A, B, C                 .94            .55
Minnesota Paper Form Board, A and B               .90            .52
Minnesota Spatial Relations, Boards A, B, C, D    .84            .53
Packing Blocks                                    .77            .26
Card Sorting                                      .90            .19

* Stepped up by Spearman-Brown formula.


The manual of directions gives instructions for administering, scoring,
and interpreting results. The authors state that the examiner need
not be a trained psychologist to administer the tests, but that he
should be thoroughly familiar with the test to give it successfully.

¹¹ Op. cit.
The test is quite elaborate and rather expensive in time and money,
but the combined battery will yield a satisfactorily valid and reliable
score for most guidance purposes.

2. Stenquist's Assembling Tests of General Mechanical Ability

This is one of the first assembling tests of mechanical ability to be
widely used. The test is composed of a small rectangular box divided
into ten compartments. Each compartment contains a small mechanical
device which is common in the experience of most people.
Some of the items selected by the author are a mouse trap, push
button, simple lock, and bicycle bell. These mechanical devices are
arranged in order of difficulty of response. In scoring, ten points are
allowed for each device, and the score on each item is the number
of correct points in assembling the device. Stenquist also suggests a
short method in which just the devices almost or completely assembled
are counted and given as the score. The second method would be
less reliable because it disregards partial accomplishment. The test
also gives a bonus of one-half point per minute for each minute under
30 minutes in responding to the test.
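
The two scoring methods may be compared by means of a short Python sketch. The function names and the sample record are hypothetical, and the threshold of nine points taken to mean "almost assembled" is an assumption, since the text does not define the term numerically.

```python
def assembling_score(points_per_device, minutes_used, time_limit=30):
    """Full method: sum the points earned on each device (0 to 10 each) and
    add a bonus of one-half point for each minute under the time limit."""
    for p in points_per_device:
        if not 0 <= p <= 10:
            raise ValueError("each device is scored from 0 to 10 points")
    bonus = 0.5 * max(0, time_limit - minutes_used)
    return sum(points_per_device) + bonus


def short_method_score(points_per_device, almost_assembled=9):
    """Short method: count only the devices almost or completely assembled.
    The nine-point threshold is an illustrative assumption."""
    return sum(1 for p in points_per_device if p >= almost_assembled)


# Hypothetical record for one pupil: points on the ten devices, 26 minutes used.
points = [10, 10, 9, 8, 10, 6, 7, 4, 2, 0]

print(assembling_score(points, minutes_used=26))
print(short_method_score(points))
```

In this example the full method credits the partial work on six devices which the short method ignores altogether, which illustrates why the short method is the less reliable of the two.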

According to the literature this test has a reliability of .70 and
correlates with intelligence-test scores about +.20 to +.30. The
test measures certain aspects of mechanical ability, but has too low
a reliability to be used with assurance in measuring mechanical ability
in individual cases. It correlates with teachers' marks in mechanical
subjects as high as +.80. Paterson, Elliott, and others found that
the test correlated +.26 with the objective criterion used in the
validation of the Minnesota Tests of Mechanical Ability. They also
found that, by increasing the length of the test, its reliability and
validity were greatly improved. Norms for different age groups are
given in the accompanying table.

TABLE 24

Percentile Norms

                          Percentiles
Age          5      10     25     50     75     90     95
Fifteen      1.0    1.7    3.2    4.6    6.2    7.2    7.9
Fourteen     1.0    1.4    2.4    4.3    6.0    7.4    8.0
Thirteen     1.0    1.5    2.5    3.9    5.3    6.6    7.7
Twelve       .7     1.0    00     2.9    3.8    5.2    6.8

C. H. Stoelting Company, Chicago. 
3. Stenquist Mechanical Aptitude Tests
Tests I and II

These pencil-and-paper tests of mechanical aptitude are made up
of pictures of mechanical things which are common in the experience
of most people. The tests are not two equivalent forms, but the two
parts (I and II) are to be used to supplement each other. In general
the student taking the test has to recognize mechanical things that
belong together or work together and answer questions about parts
or operations of machines. The working time on the first test is 45
minutes, and on the second 50 minutes. The two forms together
require 173 responses.

The reliability of the test appears to be around .75. Paterson
and Elliott have shown that this can be increased to .89 or .90 by
increasing the length of the test. The validity as checked by the best
known criterion is lower than the assembly test even when the
reliability has been corrected. Correlation with the objective criterion of
the Minnesota tests is around +.30. The test correlates as high as
+.60 with teachers' marks in shop courses.
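
The effect of lengthening a test upon its reliability is commonly estimated by the Spearman-Brown formula, which is named in the note to Table 23. The Python sketch below applies the standard form of that formula; the function names are hypothetical, and the lengthening factor shown is a calculation made for illustration, not a figure reported by Paterson and Elliott.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability of a test lengthened length_factor times:
    r_k = k * r / (1 + (k - 1) * r)."""
    r, k = reliability, length_factor
    return k * r / (1 + (k - 1) * r)


def length_factor_needed(reliability, target):
    """Number of times a test must be lengthened to reach the target
    reliability, obtained by solving the formula above for k."""
    r = reliability
    return target * (1 - r) / (r * (1 - target))


print(round(spearman_brown(0.75, 2), 2))         # test doubled in length
print(round(spearman_brown(0.75, 3), 2))         # test tripled in length
print(round(length_factor_needed(0.75, 0.90), 2))
```

Under the assumptions of the formula, a test with a reliability of .75 would have to be about three times as long to reach a reliability of .90.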

III. SUMMARY

The methods of measuring general and special types of mental
abilities are discussed in this chapter. Intelligence, as treated in this
discussion, is considered to be the capacity for learning, plus the
information, skills, and attitudes which the individual has gained from
reacting to his environment. Certain individuals react readily to
abstract stimuli; others respond most readily to concrete and tangible
situations. For this reason there seems to be a real need for both
abstract and concrete types of mental stimuli.

Intelligence tests are commonly classified as individual mental
examinations and group tests of mental ability. The results of mental
testing which are of most use to the classroom teacher are the
mental-age scores. These and all other scores derived from mental tests
should be regarded as professional information of the most confidential
type and used accordingly.

An aptitude is the capacity of an individual to develop special
abilities or skills. Mechanical aptitude represents the potential ability
of the individual to deal successfully with mechanical devices, and the
knowledge essential to the selection and operation of such devices
after a suitable period of training. An early knowledge of special
aptitudes on the part of individual pupils is of great importance to
the teacher of industrial education courses.

SUMMARY EXERCISES FOR DISCUSSION 

1. State what seems to you to be a practical and accurate definition of
   intelligence.

2. List the outstanding advantages and disadvantages of group mental tests.

3. What are the major advantages and disadvantages of the individual mental
   examination over the group test?

4. What is the difference between the mental age and the I.Q.?

5. Show how the shop teacher needs mental-test results as a protection against
   the possible misinterpretation of achievement-test results.

6. What is the basis for the statement that aptitude tests are tests of special
   types of intelligence? Is it true?

7. Why should the shop teacher be especially concerned with results from
   aptitude tests?

8. From available sources make a list of the mental tests and special-aptitude
   tests which would seem to provide the most useful information for the
   industrial education teacher.
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CHAPTER IX 


TESTS IN RELATED EDUCATIONAL FIELDS 

51. Relation of Other Educational Achievement to Industrial Arts. 

Achievement in industrial education cannot be measured adequately 
without the supplementary information procurable only 
through the use of educational tests selected from other, related fields. 
Just as it is impossible to give a proper interpretation to the results of 
achievement testing in any field of subject-matter without the use of 
such supplementary information as mental tests provide, so it is 
difficult if not impossible to secure a complete evaluation of instruction 
in the industrial arts without the use of definite and accurate measures 
of the many factors which contribute to achievement in these subjects. 

Achievement in the content subjects is limited to a very high de¬ 
gree by the student’s reading ability. The comprehension of the 
precise types of directions, symbols, and instructions given in the in¬ 
dustrial arts subjects is basic. Certain skills in arithmetic, algebra, 
and the sciences are essential in shop work. Mastery of certain English 
usages and mechanics is as essential to acceptable achievement 
in this field as in almost any other. A reasonable skill in freehand 
drawing, lettering, and handwriting is also an important limiting factor 
in industrial arts achievement. In this chapter a number of the 
more useful educational tests selected from important fields related 
to industrial education are described. 

52. Reading Tests. 

The development of the ability to read is one of the most im¬ 
portant educational and vocational accomplishments of the school. 
Achievement and school progress depend to a very large degree upon 
reading ability, and the higher up in the grades the pupil progresses 
the more important does reading ability become. In fact, in the high 
school and college, reading ability is the most important single means 
by which knowledge and information are secured. The recognition of 
silent reading as a basic study tool has done much to improve the 
quality of initial instruction given in the subject. It has also resulted 
in stimulating the development and use of effective remedial instruction 
in this field. 

The elementary school is expected to develop the general reading 
skills, including an effective comprehension of the content of material 
read at an economical and efficient rate. Unduly slow reading is a 
handicap, just as is poor comprehension. However, the elementary 
school deals with reading problems involving comparatively common 
usages and simple vocabularies. It does not concern itself especially 
with technical terms and symbols used in the special subjects, such 
as the industrial education courses. Thus it becomes largely the re¬ 
sponsibility of industrial education teachers to train their pupils to 
read the technical phases of their specific subjects. One of the most 
effective ways to accomplish this is for the teacher to determine at 
once the general reading ability of his class. In any event, it is quite 
certain that the pupil must have a general reading ability before he 
can acquire the technical reading ability required in many of the 
specialized industrial education courses. 

A number of excellent general tests of silent reading ability are 
available, but thus far no one has developed a particularly outstand¬ 
ing device for the measurement of the pupil’s ability to read the 
technical content of specialized courses. Special reading tests de¬ 
signed to parallel the teaching in the industrial education subjects are 
badly needed. Reading tests using technical content selected from 
the fields of woodworking, drawing, sheet-metal working, auto me¬ 
chanics, electricity, printing, etc., would be of great value. 

The two reading tests described in this chapter are primarily 
tests of reading ability as it exists at the secondary-school level. 
The tests have been selected as being best suited to meet the needs 
of teachers and supervisors of industrial education courses in securing 
objective and detailed analytical information concerning the different 
aspects of the reading ability of their pupils. The results from the 
use of the tests and the general specifications upon which they are 
built should prove helpful to industrial arts teachers in constructing 
and using technical reading tests for the various industrial education 
subjects. 

1. Haggerty Reading Examination, Sigma 3 
Forms A and B ¹ 

This distinctly valuable test is designed to measure general silent 
reading ability from the fifth grade through the twelfth. The total 
score on the test indicates a measure of general reading comprehension. 
The test is composed of three subtests of vocabulary, sentence reading, 
and paragraph reading. The vocabulary content of the test was validated 
by selecting the common words used in seventh- and eighth-grade 
readers and history texts. The actual working time is 28 
minutes, but the administration of the test requires about 45 minutes. 

1 Haggerty, M. E. and Laura C., Reading Examination, Sigma 3, World 
Book Company, Yonkers, New York, 1920. 

2. Iowa Silent Reading Test, Advanced 
Forms A and B (Revised) ² 

This test is designed to secure an analytical measurement of the 
silent reading skills used in reading of the work-study type. By the 
use of a series of tests sampling into several different types of reading 
skills the total comprehension score is intended to reveal general read¬ 
ing ability. The scores on the separate test parts afford the basis for 
the analysis of the strengths and weaknesses of individual students. 
The several parts of the test cover the following unit skills which con¬ 
tribute to the student’s ability to read and to work with books: 

Test 1. Paragraph meaning. 
        Science material. 
        Literary material. 
Test 2. Word meaning. 
        Social science vocabulary. 
        Science vocabulary. 
        Mathematics vocabulary. 
        English vocabulary. 
Test 3. Paragraph organization. 
        Selection of central idea. 
        Outlining. 
Test 4. Sentence meaning. 
Test 5. Location of information. 
        Use of the index. 
        Selection of key words. 
Test 6. Rate of silent reading. 

The total working time for this test in its recently revised form 
is 35 minutes. This makes it easy to administer the test within a 
class period. In spite of this relatively short working time the reli¬ 
ability of the test is high. The reliability coefficients and the P.E.’s of 
scores reported in Table 25 indicate that scores on these tests may be 
taken as very accurate measures of the silent reading abilities of high- 
school or college students. 

2 Greene, H. A., Jorgensen, A. N., and Kelley, V. H., Iowa Silent Reading 
Test, Advanced. Revised Edition for High Schools and Colleges. World Book 
Company, Yonkers, New York, 1931. 

TABLE 25 

RELIABILITY OF IOWA SILENT READING TEST, ADVANCED * 

                          Lowest                          Highest
Test              r       P.E. Score   Grade      r       P.E. Score   Grade
1             [illegible]    .65          9      .91         .62         13
2                .87      1.[illegible]3  9      .94        1.08         11
3                .86         .53         11      .94         .64         12
4             [illegible]    .94          9      .95         .65         10
5                .80         .94          9      .90         .63         13
Total Compr.     .95        5.73         12      .96        4.98          9

* Quoted from table in examiner's manual. 


53. English Tests. 

The social importance of the use of correct language habits is so 
great that no teacher can afford to relax for a moment in his demands 
for correctness in the oral and written language of his students. 
Teachers of industrial education must share this responsibility in spite 
of the facts that very often this is a field outside their specific realm 
of interest, and that the written language used in their courses is 
usually quite limited. The demands made on the oral language skills 
are usually as extensive, however, as they are in other subjects. The 
professionally minded teacher of the industrial education subjects will 
be just as careful to watch and correct the language habits of his 
pupils as he is to watch his own language usages. Correct habits of 
speech and writing come only through extensive and continuous practice 
in correct usage. The industrial arts teacher must be in a position 
to cooperate at all times with the English teachers, whose major 
responsibility it is to see that these correct habits function in school 
and the situations of daily life. 

Achievement in English expresses itself in a number of different 
ways, most of which are measurable with tests of some type. The 
most commonly measured phases of English are in the fields of word 
usage, grammar, and the mechanics of written composition. General 
merit of the total written language production is also measurable with 
the help of certain quality scales. Thus far, oral composition has 
eluded most attempts to objectify its measurement. A few tests and 
scales selected from the large number available in this subject are 
described and evaluated here from the standpoint of their use to the 
teacher of the industrial arts subjects. 

Measurement of Composition. English composition is ordinarily 
measured in one of two ways. One method concerns itself with the 
more or less objective evaluation of the general qualities of merit which 
the production possesses. This procedure makes use of a scale com¬ 
posed of written productions of different degrees of merit which have 
been scaled or evaluated and arranged in ascending order of merit 
in accordance with a numerical scale. Each different specimen is 
assigned a numerical value in terms of the relative merit it possesses. 
In general, the lower the merit of the specimen the lower the quality 
scores assigned to it. In actual use the scale is not taken into the 
classroom, but is used by the teacher as a means of assigning general 
merit ratings to the written productions prepared by students under 
rather carefully controlled conditions, and on selected topics and sub¬ 
ject-matter. The other method of measuring composition is by check¬ 
ing its form and freedom from mechanical errors. One of the very 
useful scales for the measurement of English composition, the Willing 
Composition Scale,³ makes use of both these procedures. 

The compositions, after being rated for story value or general 
merit, are rated for form value by making a careful check of all spelling, 
punctuation, capitalization, grammatical, and usage errors. The 
total number of such errors is then divided by the number of words 
comprising the composition. This result is then multiplied by 100, 
which expresses these form errors in terms of the number of such 
errors per 100 words of composition. The number of form errors per 
100 words declines as the quality or story value of the composition 
rises. 
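
The form-value computation just described can be sketched in a few lines
of Python. This is only an illustration of the arithmetic in the text, not
part of the Willing Scale itself; the function name and the sample figures
(12 errors in a 240-word composition) are hypothetical.

```python
def form_errors_per_100_words(error_count: int, word_count: int) -> float:
    """Return the number of form errors per 100 words of composition."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    if error_count < 0:
        raise ValueError("error_count must not be negative")
    # Errors divided by words, expressed per 100 words, as in the text.
    return 100 * error_count / word_count


# Hypothetical composition: 12 form errors found in 240 words.
print(form_errors_per_100_words(12, 240))  # 5.0
```

A lower result indicates better form; the story-value rating is still
assigned separately by comparison with the scale specimens.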

The Willing Scale is the only one of the commonly used scales 
which makes any attempt to combine the ratings for form and quality. 
The Thorndike Extension of the Hillegas Scale, the Hudelson Composition 
Scales, and the Nassau County Extension (Trabue) of the Hillegas 
Scale are all very useful general merit scales but confine their 
measurement entirely to composition merit. It seems quite likely 
that most industrial arts teachers interested in making any intensive 
check on the merit of the written work of their students will find the 
form values and the story values resulting from the use of the Willing 
Scale the most useful measures. 

Measurement of Grammar. Two somewhat contrasting types of 
grammar tests are described in this section. 

3 Willing, Matthew H., The Willing Composition Scale, Public School 
Publishing Company, Bloomington, Illinois, 1918. 

The Iowa Grammar Information Test ⁴ is designed to meet the 
need for a test of the purely informational aspects of English grammar. 
In addition to its survey use in English classes, it should prove 
to be a valuable measure of the formal background of grammar needed 
by students who are beginning the study of a foreign language. By 
means of 80 objective exercises of the three-answer multiple-choice 
type it samples into almost all the commonly taught phases of English 
grammar. Two equal and parallel forms are available. Percentile 
norms are based on 1557 cases in Grades VII to XII. 

The Kirby Grammar Test ⁵ is intended to be used in the measurement 
of usage and grammatical errors in Grades VII to XII. The 
pupil is tested on his knowledge of verbs, pronouns, and certain mis¬ 
cellaneous usages. For convenience in administration, the exercises 
are arranged in five divisions each containing usage exercises of the 
alternate-response type. The pupil is required to select the correct 
form for a given exercise and then to indicate (by recognition) the 
grammatical rule which governs its use. 

The reliability of the score on the principles test is about .90, 
but on the sentence test is only around .60. Norms are given for 
Grades VII to XII, but there is not a great difference between the 
norms for the different grades. This seems to indicate that pupils 
do not improve much in grammar during their secondary-school work. 
The actual working time of the test is about 35 minutes. 

Language Usage. The language-usage tests described in this sec¬ 
tion illustrate two different types of measurement in language. The 
first is an analytical test sampling many different language abilities. 
The second is a general survey of language usage based on the recogni¬ 
tion of error. 

The Iowa Elementary Language Tests ⁶ are designed for survey 
purposes in Grades IV to IX inclusive. However, the reliability of 
measurement on the different parts permits a very useful type of 
analysis of language limitations. The eleven phases of language 
ability sampled by this test range over a total of 304 different items 
with a total possible score of 338 points. In Test 1, which deals with 
two phases of word meaning, the four-answer multiple-choice type of 
exercise is used. Alternate-response exercises are used in two of the 
three tests measuring phases of language usage. The recognition-correction 
type is used in Tests 2-B, 6-A, and 6-B. A novel type of 
technique utilizing keyed brackets is used in Test 5. 

4 Cram, Fred D., and Greene, H. A., The Iowa Grammar Information Test, 
Bureau of Educational Research and Service, Extension Division, University of 
Iowa, Iowa City, 1935. 

5 Kirby, Thomas J., The Kirby Grammar Tests, Bureau of Educational 
Research and Service, Extension Division, University of Iowa, Iowa City, Iowa, 1920. 

6 Greene, H. A., and Ballenger, H. L., Iowa Elementary Language Tests, 
Educational Test Bureau, Minneapolis-Philadelphia, 1929. 

The Wilson Language Error Test ⁷ is available in two parts, each 
consisting of three forms. The forms consist of short stories of about 
300 words which contain a number of common language errors. The 
pupil is to read the story and correct the language errors. The test 
is simple to administer and, when at least three forms are used, has 
valuable diagnostic power. The errors included in the tests are those 
commonly made as indicated by studies of pupils’ errors in several 
different schools. The reliability of the test is about .80. The test 
should prove valuable to industrial education teachers in diagnosing 
common language errors. Norms are given for Grades VII to XII but 
they show approximately the same levels of achievement in all the 
grades. 

54. Vocabulary Tests. 

Several good general vocabulary tests have been developed. These 
are of some value to teachers of industrial education but they are 
included here more for the suggestions they give concerning methods 
that may be employed in developing suitable vocabulary or word 
meaning tests to parallel the different industrial education courses. 
Hunter’s ⁸ W-[illegible] Trade Names Test in Woodwork is one of the pioneer 
efforts of industrial education teachers to develop tests along the line 
of technical vocabulary. The inability of a pupil to understand the 
meaning of words used in a given course does not necessarily mean 
that he should not take the course, but indicates the need for special 
instruction and drill in word meaning early in the course so that the 
pupil can better profit from the instruction. 

The Pressey Technical Vocabularies of the Public School Subjects ⁹ 
should be most suggestive to industrial education teachers of possible methods 
of developing tests upon the technical vocabulary used in the indus¬ 
trial education subjects. This vocabulary list, which includes technical 
vocabularies for fifteen school subjects, contains a list of technical 
words pertaining to woodwork and elementary metal work, but it is 
not entirely adequate for industrial education purposes because it 
covers only these two courses. 

7 Wilson, G. M., Wilson Language Error Tests, World Book Company, 
Yonkers, New York, 1923. 

8 Hunter, W. L., Shop Tests, The Manual Arts Press, Peoria, Illinois. 

9 Pressey, Luella C., Technical Vocabularies of the Public School Subjects, 
Public School Publishing Company, Bloomington, Illinois, 1923. 

The procedure in selecting and rating the technical vocabularies 
was as follows. First, all unusual or technical words which appeared 
in commonly used textbooks of the subjects treated were tabulated 
and classified; second, the terms were rated according to importance 
by a group of special teachers of the subjects; and third, terms were 
classified as essential that were checked by more than half of the 
teachers. This is in no sense a test but is a suggestive vocabulary 
study for further development in the industrial education field. 
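
A teacher wishing to apply the same rule to a shop vocabulary of his own
could tally the ratings along the following lines. This is a minimal
sketch of the "more than half of the teachers" rule only; the function
name, the terms, and the counts are hypothetical and are not taken from
the Pressey lists.

```python
def essential_terms(checks_by_term: dict[str, int], teacher_count: int) -> list[str]:
    """Return the terms checked by more than half of the rating teachers."""
    if teacher_count <= 0:
        raise ValueError("teacher_count must be positive")
    # "More than half" is strict: a term checked by exactly half is excluded.
    return sorted(
        term
        for term, checks in checks_by_term.items()
        if 2 * checks > teacher_count
    )


# Hypothetical tally: number of teachers (out of 10) who checked each term.
checks = {"mortise": 9, "chamfer": 6, "kerf": 5, "burnish": 2}
print(essential_terms(checks, teacher_count=10))  # ['chamfer', 'mortise']
```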

55. Spelling. 

Industrial education teachers have a distinct responsibility to teach 
the pupils in their classes to spell the technical words peculiar to their 
courses and to aid the other teachers in the school in maintaining 
proper spelling levels in written work. To equip the child with a 
method of learning to spell and to teach the spelling of commonly used 
words is the specific function of the elementary school, but it takes 
continual cooperation by teachers of all subjects to assure the lasting 
assimilation and mastery of these fundamental skills. Teachers of 
industrial education should recognize this responsibility. 

The majority of available spelling tests and scales have been developed 
for the elementary school. However, at least two such spelling 
scales are definitely designed for use at the secondary-school level. 
These two scales which are briefly described here should prove of 
very definite value to industrial education teachers in measuring gen¬ 
eral spelling ability on the junior- and senior-high-school levels. They 
should also prove suggestive for the construction of spelling scales 
dealing with the technical words which are an integral part of instruction 
in industrial education. 

1. Sixteen Spelling Scales Standardized in Sentences for 
Secondary Schools ¹⁰ 

These scales, frequently called the Seven-S Scales, consist of 16 
separate and scaled lists of 20 words each. It requires about 5 minutes 
to give any one of the scales, and for individual scores it is advisable 
to use two and combine the scores. The tests have been carefully 
prepared and afford a very satisfactory means for industrial 
education teachers to measure the general spelling ability of their students. 
The scales do not measure the ability to spell the related technical 
words in industrial education. 

10 Hudelson, Earl, Stetson, F. L., and Woodyard, Ella, Sixteen Spelling 
Scales Standardized in Sentences for Secondary Schools, Bureau of Publications, 
Teachers College, Columbia University, New York City, 1920. 

The validity of the scales is based on the second and third thousand 
most common words as found in a composite list based on four separate 
vocabulary studies. The words in the scales are so selected 
and arranged that each word is one-tenth of a sigma unit more difficult 
than the preceding word. In administering the tests the words are 
given in sentences, but the pupils are required to write only the one 
word that is to be tested. The reliability is reported as being high. 
Norms are provided for Grades VII to XII. 

2. Simmons-Bixler Standard High-School Spelling Scale 
Forms I, II, III, IV ¹¹ 

This unusually valuable spelling scale for high-school use is based 
upon an extensive program of investigation in high-school spelling 
undertaken by Mr. Simmons, and supplemented by a revision of the 
original material under the direction of Dr. Bixler. The result is a 
series of four forms of scales each containing a preliminary spelling 
test of 100 words, and 64 scaled lessons of 40 words each. The source 
of the vocabulary is the socially significant list of words comprising 
Horn’s Basic Writing Vocabulary: 10,000 Words Most Commonly 
Used in Writing,¹² after the elimination of a number of abbreviations, 
irregular forms, and a group of words spelled correctly by 90 per cent 
or more of high-school freshmen. The scale location of each word is 
based upon 200 spellings per grade. 

In addition to the scaled tests, an alphabetical list of 2910 words is 
presented with the percentile placement of each word for students in 
Grades IX to XII. Such a scale constitutes a valuable source of in¬ 
structional material for use in conducting the spelling “hospital” as 
well as a useful source of test material of known difficulty. 

56. Writing. 

The development of the initial skills in writing is one of the functions 
of the elementary school, but if students are to be legible writers 
after the formal education period they must be checked continuously 
by the teachers in all subjects. Although it is desirable that the basic 
writing habits become more or less automatic, it is also desirable that 
conscious writing be perfected to such a degree that it will still be 
legible and of good quality when it does become automatic. Industrial 
education teachers can aid in the development of good writing 
habits by demanding legible writing and printing from the students 
and by displaying high-quality specimens of such work on the bulletin 
board or on wall charts. 

11 Simmons, Ernest P., and Bixler, H. H., A Standard High School Spelling 
Scale, Turner E. Smith and Company, Atlanta, 1928. 

12 Horn, Ernest, A Basic Writing Vocabulary, State University of Iowa 
Monographs in Education, Series 1, No. 4, University of Iowa, Iowa City, 1926. 

Writing has two characteristics which are important in rating, 
namely, quality and speed. Quality is usually determined by having 
samples of handwriting rated by qualified judges and the samples 
placed in order of merit on a linear scale. Samples of pupils’ writing 
obtained under standard conditions may then be compared and rated 
according to the value of the specimen on the scale which it most 
nearly resembles in quality. Speed is determined by counting the 
number of letters of standard copy written in one minute. Eighty 
letters per minute is considered a satisfactory speed for pupils in the 
ninth grade. 
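
The speed measure lends itself to a simple calculation, sketched below in
Python. The sketch assumes that only alphabetic characters are counted as
letters and that a timed sample longer than one minute may be reduced to a
per-minute rate; both are assumptions made for illustration, and the
function names and sample figures are hypothetical.

```python
NINTH_GRADE_STANDARD = 80  # letters per minute, the figure given in the text


def count_letters(copy: str) -> int:
    """Count the alphabetic characters in a written sample."""
    return sum(1 for ch in copy if ch.isalpha())


def letters_per_minute(letters_written: int, minutes: float = 1.0) -> float:
    """Reduce a timed letter count to a rate per minute."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return letters_written / minutes


# Spaces are not counted as letters.
print(count_letters("Four score and seven years ago"))  # 25

# Hypothetical pupil: 170 letters written in a two-minute sample.
speed = letters_per_minute(170, minutes=2)
print(speed, speed >= NINTH_GRADE_STANDARD)  # 85.0 True
```

The quality rating is not touched by this calculation; it is obtained
separately by comparing the specimen with the scale.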

Speed and quality are not rated together. If a pupil of average 
writing ability writes slowly and carefully the quality of his writing 
may improve. If he writes very rapidly there is likely to be a reduc¬ 
tion in quality. Both speed and quality can be improved through 
practice. If a pupil is to reach a maximum speed and quality he must 
also have a good technique (proper position at the desk, pencil or pen 
and paper held in correct position, etc.). 

The rating of handwriting is valuable to industrial education teach¬ 
ers from another angle since it is similar to the rating of quality of 
workmanship on industrial education projects. It is almost identical 
with the rating of lettering in drawing, and it has many factors in 
common with the rating of soldering, riveting, boring, and splicing 
wire. It is also well to note at this point that speed and quality are 
measured as separate items. This is also true of speed and quality 
in rating the results of manual operations in industrial education. 

Two handwriting scales which should prove valuable to teachers 
in rating quality and in diagnosing faults in handwriting have been 
selected for description. A copy of these scales might well be posted 
in the shop and used by students to check and analyze samples of 
their handwriting and as constant reminders to improve their own 
writing. 

1. Ayres Handwriting Scale ¹³ 

The Ayres Handwriting Scale now in most common use is known 
as the “Gettysburg Edition” because the samples in the scales are 
based upon copy from the first four sentences of Lincoln’s “Gettysburg 
Address.” The scale consists of nine widely varying specimens of 
handwriting graduated by tens from twenty to ninety. Each section 
on the scale is represented by a twelve-line section from the “Gettysburg 
Address.” The relative merit of the specimens was determined 
by the differences in the lengths of time required by trained judges 
to read each sample. Thus legibility becomes the criterion of merit. 
This procedure is distinctly in contrast with that used by Thorndike 
in the development of his Handwriting Scale.¹⁴ The results of the use 
of the two types of scales in the classroom are quite similar, however, 
in spite of the differences in their construction. Available standards 
for the various handwriting scales are established only for the 
elementary-school grades and accordingly are of little value above the 
eighth grade. However, it may be useful to point out that the writing 
of junior- or senior-high-school pupils should be quality 60 or above 
on the Ayres Scale at a speed of approximately 80 letters per minute. 

13 Russell Sage Foundation, New York, New York, 1912. 

2. Freeman’s Diagnostic Handwriting Scale ¹⁵ 

No discussion of measurement of handwriting would be complete 
without at least a brief mention of the Freeman Chart for the Diagnosis 
of Handwriting Faults. By the use of this analytical chart, 
attention may be focused upon such qualities as uniformity of slant, 
uniformity of alignment, quality of line, letter formation, and letter 
and word spacing. Slant of letters may be revealed by drawing lines 
through the letters indicating their slant. If the lines are not parallel 
the lack of uniformity in letter slant is revealed. Alignment may be 
shown by drawing lines parallel with the bottom and tops of the 
smaller letters. Weaknesses in letter formation are more difficult to 
reveal and to classify since there are so many different types. Im¬ 
properly closed a’s and o's and badly formed n’s and w’s are common 
types of letter-formation difficulties. Too crowded as well as too 
widely spaced letters and words operate to reduce the quality of 
writing. The critical and ambitious teacher of industrial arts sub¬ 
jects will find many opportunities to use this effective analytical scale 
in bringing about distinct improvements in the handwriting of his 
students. 

57. Measurement of Mathematics. 

Throughout the work in industrial arts subjects, frequent demands 
are made on certain basic mathematical skills. In general, these skills 
are presented for initial learning in the courses in arithmetic, algebra, 
and plane geometry. 

14 Thorndike, Edward L., “Handwriting,” Teachers College Record, Vol. 11: 
1-93, March, 1910. 

15 Freeman, F. N., Freeman Chart for Diagnosing Faults in Handwriting, 
Houghton Mifflin Company, Boston, 1914. 

Arithmetic. Although arithmetic is rightly considered an elementary-school 
subject, it is an important factor in achievement in many 
secondary-school subjects. Arithmetical skills are in demand in practically 
all classroom and shop activities in the industrial arts. Accuracy 
in making calculations in connection with shop work and other 
industrial subjects is an important factor in such achievement. Among 
the arithmetic tests which are most likely to be of use to the teacher of 
industrial education are such tests as the Compass Survey Tests, Advanced 
Examination,¹⁶ for Grades IV to VIII, the New Stanford 
Achievement Arithmetic Test,¹⁷ for Grades IV to IX, the Otis Reasoning 
Tests in Arithmetic,¹⁸ for Grades IV to IX, and possibly certain 
selected narrow function tests in arithmetic such as the Compass Diagnostic 
Tests. 

High School Mathematics. Algebra and plane geometry represent 
the phases of high-school mathematics of most interest to the teacher 
of the industrial arts subjects. Among the first-year algebra tests 
which may readily be of use to the shop teacher is the Columbia Research 
Bureau Algebra Test.¹⁹ In its present form this test is in two 
parts. Part I is designed to cover the algebra commonly taught in the 
first semester of the course. Part II covers the second semester's work. 
A much more intensive type of measurement is provided by the Iowa 
Unit-Achievement Tests in Algebra.²⁰ These tests are in two equal 
forms each made up of six tests covering the entire year’s work in 
first-year algebra. The standards represent achievement as of the 
time when the original instruction on the material was completed. 

Achievement in plane geometry may be effectively surveyed by 
such end-of-the-year tests as the Schorling-Sanford Plane Geometry 
Tests ²¹ or the Columbia Research Bureau Plane Geometry Tests.²² 
For periodical measurement of achievement over relatively small sections 
of plane geometry instruction, tests such as the Lane-Greene 
Unit-Achievement Tests in Plane Geometry ²³ may be used. 

16 Greene, H. A., Knight, F. B., Ruch, G. M., and Studebaker, J. W., The 
Compass Survey Tests, Scott, Foresman and Company, Chicago, 1927. 

17 Ruch, G. M., Terman, L. M., and Kelley, T. L., The New Stanford 
Achievement Tests, World Book Company, Yonkers, New York. 

18 Otis, Arthur S., Otis Reasoning Tests in Arithmetic, World Book Company, 
Yonkers, New York, 1923. 

19 Otis, Arthur S., and Wood, Ben D., Columbia Research Bureau Algebra 
Test, World Book Company, Yonkers, New York, 1927. 

20 Greene, H. A., and Piper, A. H., The Iowa Unit-Achievement Tests in 
First-Year Algebra, Bureau of Educational Research and Service, Extension 
Division, University of Iowa, Iowa City, 1931. 

21 Schorling, Raleigh, and Sanford, Vera, The Schorling-Sanford Plane 
Geometry Tests, Teachers College Bureau of Publications, Columbia University, New 
York City, 1925. 

22 Hawkes, Herbert E., and Wood, Ben D., Columbia Research Bureau Plane 
Geometry Test, World Book Company, Yonkers, New York, 1926. 

In addition to the tests in arithmetic, algebra, and plane geometry 
which have been described in this chapter there is a real need for tests 
of related mathematics to parallel the several industrial education 
courses in electricity, auto mechanics, sheet-metal working, drawing, 
and printing. Something similar to the type of inventory measurement 
secured by the Kilzer-Kirby Inventory Test for the Mathematics 
of High-School Physics ²⁴ is greatly needed in these fields. Hunter ²⁵ 
has recognized this need and has done some pioneer work by developing 
short tests in shop mathematics, shop arithmetic, and geometry. 
These tests are not standardized or long enough to be highly reliable, 
but they are of some value for measuring the mathematics related to 
industrial arts and for the suggestions they offer for further development 
along similar lines. 

58. Measurement in Sciences Related to Industrial Arts. 

Certain contributions of the high-school sciences are apparent in
many of the industrial arts courses. Accordingly, complete measurement
of achievement in these courses must at least include some attention
to the fields of high-school physics, chemistry, and general science.
Such survey tests as the Columbia Research Bureau Physics Test²⁶
will be found to be very effective measures of end-of-the-year
achievement in physics. In a similar way the Columbia Research Bureau
Chemistry Test will be an effective survey instrument for use by the
industrial arts teacher. General science covers so many different
phases of the sciences that without doubt it is one of the most useful
fields to survey in any attempt to discover the range of information in
the sciences held by industrial arts students. For this purpose one of
the most useful tests is the Ruch-Popenoe General Science Test.²⁷
Standards are provided for one-semester and year courses in this
subject.

²³ Lane, Ruth, and Greene, H. A., The Lane-Greene Unit-Achievement Tests
in Plane Geometry, Ginn and Company, Boston, Mass.

²⁴ Kilzer, L. R., and Kirby, T. J., Inventory Test for the Mathematics of
High-School Physics, Public School Publishing Company, Bloomington, Illinois,
1929.

²⁵ Hunter, W. L., Shop Tests, Series No. 3, The Manual Arts Press, Peoria,
Illinois, 1927.

²⁶ Farwell, H. W., and Wood, Ben D., Columbia Research Bureau Physics
Test, World Book Company, Yonkers, 1926.

²⁷ Ruch, G. M., and Popenoe, H. F., Ruch-Popenoe General Science Test,
World Book Company, Yonkers, New York, 1923.

SUMMARY 

Achievement in industrial education cannot be completely and
effectively measured without the use of supplementary educational tests
selected from other related fields. Since the language skills as
represented by reading, language usage, grammar, spelling, handwriting,
vocabulary, and composition abilities are so basic and so fundamental
to achievement in industrial education subjects, considerable attention
is given to the discussion of tests in these fields. The social
importance of the correct use of those language skills is so great that no
teacher can afford to relax for one moment in his demands for
correctness in the oral and written language habits of his students.

Demands are made on certain of the high-school science courses
by many of the units of work in industrial education courses.
Accuracy and speed in making certain mathematical calculations in
connection with shop work are also desirable accomplishments.
Accordingly, the teacher of industrial education will wish to sample
somewhat liberally the abilities in these other related fields of
educational achievement.


SUMMARY EXERCISES FOR DISCUSSION 

1. What educational fields appear to be most closely related to achievement
in industrial arts?

2. In what specific ways does achievement in industrial arts and other
high-school subjects appear to be related to the ability to read rapidly and
well?

3. Catalogue the major language skills which should receive the attention of
the teacher in industrial arts subjects.

4. In your judgment, what is the most acceptable basis for the selection of a
high-school spelling vocabulary?

5. What is the responsibility of the teacher of industrial arts subjects for
satisfactory mastery of spelling and handwriting on the part of his
students?

6. Secure a copy of the Ayres Handwriting Scale and rate at least a dozen
samples of handwriting representing a wide range of quality. After two
or three days rate the samples again without reference to the scores
previously assigned. On what percentage of samples did your two sets
of marks agree within five points on the scale?

7. List a few of the more important arithmetical skills which appear to persist
into the high school.

8. Why are there no adequate diagnostic tests in algebra or geometry?

9. What special procedures can you suggest for improving problem solving
either in arithmetic, algebra, or in the sciences?

10. Compare two selected algebra or geometry tests showing complete lists of
specific skills measured by each.
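
The tally called for in Exercise 6 is simple arithmetic, and a short sketch may help the reader who wishes to check his own figures. The function name, the tolerance argument, and the twelve pairs of ratings below are all hypothetical and serve only as illustration; they are not taken from the Ayres scale or from any actual set of samples.

```python
def percent_agreement(first, second, tolerance=5):
    """Percentage of samples whose two ratings differ by at most `tolerance` points."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty rating lists of equal length")
    agree = sum(1 for a, b in zip(first, second) if abs(a - b) <= tolerance)
    return 100.0 * agree / len(first)


# Hypothetical ratings of the same twelve samples on two occasions.
first_rating = [20, 30, 40, 50, 60, 70, 80, 90, 35, 45, 55, 65]
second_rating = [25, 30, 50, 50, 55, 80, 80, 85, 40, 60, 55, 70]

print(percent_agreement(first_rating, second_rating))
```

In these invented figures nine of the twelve pairs of marks fall within five points of each other, so the agreement is 75 per cent.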
TESTING TECHNIQUES IN INDUSTRIAL EDUCATION 

I. TYPES OF OBJECTIVE TEST EXERCISES 

Certain types of objective exercises which are useful in constructing
tests in industrial education are discussed and evaluated in this
chapter. Many different forms of test exercises have been used
successfully in other subjects to objectify pupils’ responses. Doubtless,
new or modified types will be developed to meet future testing needs.
Industrial education teachers should become critical of such
procedures, and should learn to select and use the types of test exercises
best adapted to their instructional materials.


59. Objective Techniques Adapted to Testing in Industrial Education. 


In the measurement of technical knowledge the usual types of
objective exercises can be used. The measurement of manipulative
ability requires types of objective exercises designed specifically for the
purpose. Two general types of test exercises are used in measuring
information and manipulative ability, namely, (1) the recall type and
(2) the identification type. In a recall exercise the pupil is called
upon to supply the answer from memory. In the recognition or
identification types, the pupil must choose the correct response from
several possibilities. The latter type involves the recalling of
characteristics and relationships but does not call upon memory for the
major items of the exercise.


It would be hopeless to attempt to illustrate all the possible forms
of objective exercises which have been used in testing, but several
examples are given here which should suggest to the industrial
education teacher in his construction of tests ways of meeting his
measurement needs in the classroom and shop. The different objective types
should be studied carefully so that the teacher can readily select those
best adapted to measuring different phases of information and
manipulative ability.

60. Classification of Objective Test Exercises. 

Objective test exercises of types most likely to be of use in the
industrial education classroom and shop fall into the following
classifications:

I. Recall.
   A. Simple recall.
   B. Completion exercises with one or more key words omitted.
   C. Completion exercises with answers suggested or controlled.

II. Recognition.
   A. Multiple-response tests.
      1. One correct response.
      2. Multiple-answer exercise with varying degrees of merit.
      3. Multiple answers with one or more correct answers.
   B. True-false exercises.
      1. Yes-no questions.
      2. True-false statements.
      3. Diagram and true-false.
      4. Double true-false statements.
   C. Matching exercises.
      1. Word matching.
      2. Picture matching.
      3. Unbalanced column.
   D. Rearrangement test exercises.
      1. Order of operations.
      2. Classification.

III. Performance.
   A. Quality or accuracy.
   B. Identification of tools and materials.
      1. Simple recognition.
      2. Recognition and analysis.
   C. Technique.
   D. Speed or rate of response.

61. Recall Exercises. 

The varied forms of the recall-test exercises have been widely used
in test construction in all fields. In an analysis of 375 tests from
several instructional fields Conneau¹ found that nearly 30 per cent of all
the test exercises were of the completion type. The recall exercises
are of real value in testing information in industrial education, but of
less value in measuring manipulative activity.

¹ Ruch, G. M., The Objective or New-Type Examination, Scott, Foresman and
Company, New York, Chapter VIII, p. 189.

The recall type of test exercise has gained favor as a device for
the measurement of information in all fields because it is almost
entirely objective when properly constructed and administered.
Guessing and chance factors operate very slightly. The recall question is a
natural form of questioning and is easily and rapidly scored. An
important limitation of the recall type of test item is that it tends to
be merely factual in character. The recall exercise also requires a
great deal of care in preparation, for the reason that unless the missing
clue words are carefully chosen several answers will be possible, which
will make the scoring difficult at times and bring in the subjective
judgment of the teacher. In constructing a completion test it has been
found advisable to have each blank call for a single idea, and to avoid
a large number of blanks by omitting only a few key words.


Sample Recall Exercises

Simple Recall

1. Directions: Answer each of the following questions with a single word.
Write the word on the line after the last word of the question.

   1. What oil is used in first-quality outside paint? ....
   2. In what year was the Centennial Exposition held in Philadelphia? ....
   3. Who introduced the Russian system of manual training into America? ....
   4. What liquid is commonly used to thin cabinet varnish? ....

2. Directions: After each finishing material write the proper thinner.

   1. Shellac ....
   2. Varnish ....
   3. Paint ....
   4. Lacquer ....
   5. Enamel ....
   6. Kalsomine ....

Completion Exercises

3. Directions: The following statements are to be completed by adding one,
and only one, word in each blank.

   1. Oak is a good cabinet ....
   2. The length of a meter is .... feet .... inches.
   3. Cabinet glue should not be heated above .... degrees Fahrenheit.
   4. Wood should not be .... across the grain.
   5. The surface of a cabinet wood is prepared for finishing by ....,
      ...., and ....

Completion Exercises with Answers Suggested

4. Directions: Complete the following sentences by inserting one of the words
found in the list on the right of the page. The words are to be used only once.

   1. A cabinet scraper is used to .... a surface.            1. Squareness
   2. A try-square is used to test for ....                   2. Smooth
   3. A .... is used for cutting lengthwise of the grain.     3. Mill
   4. A .... file is used to shape the edge of a cabinet      4. Ripsaw
      scraper.

62. Multiple-Response Exercises. 

The multiple-response test is one of the most satisfactory objective
test exercises to use in the measurement of information and reasoning.
On the average it is somewhat more reliable than the true-false type,
but is probably not so reliable as the recall test when the tests are
equated for length in terms of the number of items in each. It is
fairly easy to score, but not easy to construct.

Guessing is a factor which must be taken into account in every
objective test form in which the single correct answer must be selected
from two or more suggested responses. In theory, at least, guessing is
reduced in multiple-response items by increasing the number of
suggested responses. There is a practical limit to this, however, for it
soon becomes apparent that it is impossible to select large numbers
of equally plausible wrong responses for an item. If an exercise were
made up with five responses, three of which were so obvious that they
would be eliminated at once by a pupil with only a minimum of
information, the test exercise would be no more effective than it would be
if it were made as an alternate-response exercise to begin with. As a
matter of fact, it would be made less effective by the inclusion of the
useless material. The tendency at the present time seems to be in the
direction of the three-response type. In any event, it is usually
desirable to prepare the exercises with the same number of responses
throughout the test, if it is to be corrected for chance by the usual
formula. The factor of guessing in objective tests is discussed in more
detail in Section 71 of this chapter.
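
The passage above speaks of "the usual formula" without stating it at this point. The sketch below assumes that the formula meant is the familiar one in which the score equals the number right minus the number wrong divided by one less than the number of responses per item, S = R - W/(n - 1). This is an assumption made for illustration, and the function names are hypothetical; the treatment in Section 71 should be taken as governing. The sketch also shows why a uniform number of responses is wanted: the divisor n - 1 applies to the whole test only if every item has the same n.

```python
def corrected_score(rights, wrongs, n_responses):
    """Score corrected for chance: R - W / (n - 1); omitted items are not counted."""
    if n_responses < 2:
        raise ValueError("an item needs at least two suggested responses")
    return rights - wrongs / (n_responses - 1)


def expected_chance_rights(n_items, n_responses):
    """Number of items a pupil would be expected to get right by blind guessing."""
    return n_items / n_responses


# A pupil marks all 60 items of a three-response test: 40 right, 20 wrong.
print(corrected_score(40, 20, 3))

# Blind guessing on the same test yields about 20 right and 40 wrong,
# which the correction reduces to zero.
print(expected_chance_rights(60, 3))
print(corrected_score(20, 40, 3))
```

With two responses per item the same function reduces to rights minus wrongs, the form ordinarily applied to true-false exercises.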

In the multiple-choice form of exercise the pupil indicates the
correct response by underlining or checking the answer, or by placing the
number of the correct response on a blank at the end of the exercise.
The writing of the number of the correct response rather than the
response itself reduces the amount of writing required of the student
and in general is quite satisfactory and objective.

Sample Multiple-Response Exercises: One Correct Answer

5. Directions: Each of the following statements can be correctly completed
by one and only one of the numbered expressions. You are to write the number
which stands for the correct expression on the line at the right of the exercise.

   1. Lacquer is thinned with
      (1) turpentine (2) alcohol (3) amyl acetate (4) mineral oil ....
   2. A shellac brush is cleaned with
      (1) turpentine (2) water (3) alcohol (4) gasoline ....
   3. No. “00” sandpaper is coarser than
      (1) No. “0” (2) No. “000” (3) No. 2 (4) No. 5 ....
   4. Outside paint is thinned with
      (1) water (2) paint remover (3) linseed oil (4) alcohol ....

Multiple-Answer Exercises with Answers of Varying Degrees of Merit

6. Directions: Underline the one word in the parentheses of each statement
which best completes the statement.

   1. (Walnut, bass, pine, balsa) is a favorite cabinet wood.
   2. Mahogany is used for making (barrels, ships, furniture, fence posts).
   3. Cypress is used in making (water tanks, beds, floors, boxes).
   4. Redwood grows in (Indiana, Iowa, California, Louisiana).

Multiple-Answer Exercises with One or More Correct Answers

7. Directions: Underline all the words in each parenthesis which will make
true statements.

   1. (Cypress, pine, redwood, elm) used in making water tanks.
   2. (No. 2, No. “000,” No. 4, No. “00”) fine sandpaper.
   3. (Oak, pine, basswood, ebony) soft wood.
   4. (Maple, walnut, fir, yellow pine) good cabinet wood.

63. True-False Exercises. 

The true-false or “yes-no” form of test exercise is one of the most
popular types for measuring information. The true-false exercise is
objective, easy to score, has wide adaptability, permits extensive
sampling in short working periods, and if ingeniously devised may be
used to measure reasoning as well as memory. However, it is not
adapted to the measurement of manipulative skills.

High-quality objective exercises of the true-false type are not so
easy to construct as it might at first appear. Only materials which
are strictly true or false should be put into true-false exercises. Double
negatives and trick questions have no place in true-false questions.
They should be stated in simple, direct language. The purpose is to
get an objective measure of the pupil’s knowledge, and not to confuse
or bewilder him intentionally. Any true-false items that are likely to
suggest the correct answer to other items should be widely distributed
in the test. If reliable results are desired, true-false statements and
other forms of test exercises should not be dictated to the class.
Paterson² reports that dictating test items to a class tends to reduce
the reliability of the test. If possible, separate copies of the test
should be prepared for each pupil in the class.

The chief limitations of true-false test exercises are that they are
open to the influence of guessing and chance factors, and also that they
are rather difficult to construct so that the items will be strictly true or
false without being too obvious. Attempts to make them less obvious
usually make them ambiguous. Both these limitations can be
overcome to a large extent by the thoughtful test worker if he will take
unusual care in the construction of the exercises, and correct for
guessing when scoring the test.

Two types of alternate-response test exercises (true-false; yes-no)
are recognized: the single and double types. The single true-false
statement is the more common type and has either a true or a false
statement for each fact measured in the test; the double true-false
has both a true and a false statement for each concept in the test,
both of which must be answered correctly in order for the pupil to
score on the pair. The paired or double true-false test is a later form
designed to control the effect of chance somewhat more definitely by
having forced the pupil to respond to both a true and a false item on
each fact or concept. The double true-false test undoubtedly does
eliminate chance to some extent, but the two test items which relate
to the same point must be so distributed in the test that the pupil
cannot make a direct comparison of them. A test made up of 100 paired
true-false items has been shown to be more reliable than 100 items
stated in the usual alternate form, but it requires the same amount of
space and time as would be devoted to an ordinary true-false test of
200 items. The reliability and the apparent validity of measurement
resulting from the paired exercise test will be somewhat higher.³
However, the difficulty of making suitable statements of important or basic
items in the subject-matter in paired form (in both true and false
form) is very great. Experience in formulating true-false exercises
soon makes it apparent that certain subject-matter concepts lend
themselves to statement in one form much more readily than they do
in the other. In the development of ordinary true-false examinations
it will be found to be a very good practice to go through the basic
facts carefully, writing first the false statements which are to go into
the test, and then follow with a like number of true statements, which
usually are much more easily stated.

² Paterson, D. G., Constructing New-Type Examinations, World Book
Company.

³ Greene, H. A., “A New Correction for Chance in Alternate Response
Exercises,” Journal of Educational Research, 17: 102-107, February, 1928.
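
The way in which pairing controls chance may be verified by simple enumeration. The sketch below assumes that the pupil guesses blindly and independently on each statement, and that credit is given on a pair only when both statements are marked correctly, as described above. The function names and the small answer key are hypothetical and serve only as illustration.

```python
from itertools import product


def chance_of_credit(n_statements):
    """Chance of earning credit by blind guessing when every one of
    `n_statements` true-false statements must be marked correctly."""
    patterns = list(product([True, False], repeat=n_statements))
    key = patterns[0]  # any one pattern may stand for the correct key
    hits = sum(1 for pattern in patterns if pattern == key)
    return hits / len(patterns)


def score_pairs(key, answers):
    """Count the pairs on which both statements were marked correctly."""
    return sum(1 for k, a in zip(key, answers) if a == k)


print(chance_of_credit(1))  # single true-false statement
print(chance_of_credit(2))  # paired (double) true-false statements

# Hypothetical key and responses for three paired concepts.
key = [(True, False), (False, True), (True, False)]
answers = [(True, False), (False, False), (True, False)]
print(score_pairs(key, answers))
```

Under these assumptions the chance of credit falls from one in two for a single statement to one in four for a pair, which agrees with the claim that the double form reduces, but does not remove, the effect of guessing.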

True-false exercises have been criticized by some writers because
they were of the opinion that the presentation of false forms had a
negative effect on learning. Studies by Remmers and Remmers⁴ and
by Roberts and Ruch⁵ have shown that the negative suggestion effects
are not so significant as might be supposed. The evidence indicates
that true-false statements are a stimulus to learning and are of real
value, although this has not been conclusively proved. The burden
of proof now seems to rest with those who believe that the true-false
question exerts a negative effect.


Samples of True-False Exercises

True-False Statements

8. Directions: Examine each statement below and decide whether it is true
or false. If true, underline true; if false, underline false. If an item is too
hard, skip it and go on to the next one. Do not guess. This test will be
corrected for guessing.

   1. The carburetor maintains the correct proportion of
      fuel and air at all speeds.                              True  False
   2. The generator on an automobile steps up the primary
      current into a high tension spark.                       True  False
   3. The transmission makes possible different speeds
      forward and reverse.                                     True  False
   4. The distributor turns the motor over until it has drawn
      gas in and compressed it.                                True  False

Diagram and True-False

9. Directions: Read the following statements about the drawing (Fig. 6) and
mark them true or false by referring to the drawing. If a statement is true,
underline the word true; if false, underline the word false. Do not guess.
This test will be corrected for guessing.

Fig. 6. Isometric Drawing of Radio.

   1. The radio cabinet is 2' long.                            True  False
   2. The top of the cabinet is %" thick.                      True  False
   3. The width of the base is 9".                             True  False
   4. The depth of the cabinet is 8".                          True  False
   5. The base of the cabinet is 1" thick.                     True  False
   6. The length of the top is not given.                      True  False
   7. The front panel of the cabinet is 8" high.               True  False

⁴ Remmers, H. H., and Remmers, E. M., “The Negative Suggestion Effect
of True-False Examination Questions,” Journal of Educational Psychology,
Vol. 17: 52-56, 1926.

⁵ Roberts, H. M., and Ruch, G. M., “The Negative Suggestion Effect of
True-False Tests,” Journal of Educational Research, Vol. 18: 112-116,
September, 1928.

64. Matching Exercises. 

Matching exercises are valuable in industrial education for
measuring relationships between items of information, or tools and
materials. The pupil taking the test is called upon to recognize
relationships between a test list and an answer list. The pupil usually writes
the number of the related item before or after the unnumbered item.
Matching tests are objective, easily scored, and in certain subject fields
easy to construct. They can be used in measuring factual materials
or judgment. In constructing matching exercises, it is important to
have 10 or more items to reduce the operation of chance, but if very
long lists are used, 25 or 30 items, considerable time is lost by the pupil
in sorting the various related pairs. An improvement in this practice
is to arrange two separate groups of 10 or more items.
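
The effect of list length on chance can be examined by enumerating every way in which a pupil might pair the two columns. The sketch below assumes a balanced exercise in which each number is used exactly once and the pupil assigns the numbers entirely at random; the function name is hypothetical. Short lists are used so that the enumeration runs quickly.

```python
from itertools import permutations


def mean_chance_matches(n_items):
    """Average number of correct pairs over every possible one-to-one
    assignment of n_items answers, that is, the result of blind guessing."""
    total_correct = 0
    n_assignments = 0
    for guess in permutations(range(n_items)):
        total_correct += sum(1 for position, item in enumerate(guess) if position == item)
        n_assignments += 1
    return total_correct / n_assignments


for n in (3, 5, 7):
    average = mean_chance_matches(n)
    print(n, average, average / n)
```

Under these assumptions the average number of pairs matched by blind guessing is one, whatever the length of the list, so the proportion of the exercise obtainable by chance is one-third for three items, one-fifth for five, and would be one-tenth for ten. This is one way of seeing why lists of 10 or more items are recommended.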


Samples of Matching Tests

Word Matching

10. Directions: Below are two columns of words which are related in meaning.
Write the numbers of the words in the left-hand column on the blanks in the
right-hand column so that they will show the items which are related. For
example, “Hammer” is number 1. You find the word “Nail” in the right-hand
column. Place the figure 1 in the blank before the word nail: “..1.. Nail.”
Indicate the relation of all the other items in a similar manner. Do not guess.

    1. Hammer                  .... Brace
    2. Saw                     .... Sheet copper
    3. Mortise                 .... Alcohol
    4. Open-grain wood         .... Linseed oil
    5. Turpentine              .... Sandpaper
    6. Shellac                 .... Tenon
    7. Wood                    .... Nail
    8. Outside paint           .... Rip
    9. Bit                     .... Paste filler
   10. Tin snips               .... Varnish
   11. Knife                   .... Thumb tack
   12. Drawing board⁶          .... Oilstone

⁶ Ruch (The Objective or New-Type Examination, Scott, Foresman and
Company, p. 227) suggests that, when it seems desirable to have less than 10
complete pairs, an excess of statements be made in one column or the other
to aid in caring for the chance element.

Parts Identification Test

11. Directions: Study the drawing of the automobile engine (Fig. 7). Notice
that some of the engine parts are numbered. Below the drawing are the names
of the engine parts which are numbered in the drawing. Write the numbers
appearing in the drawing in the blanks opposite the correct name for each part.

Fig. 7. Ford V-8 Motor.

   Spark plug ....             Gas pump ....
   Exhaust manifold ....       Generator ....
   Water pump ....             Air cleaner ....
   Starter ....                Transmission ....
   Carburetor ....             Crank shaft ....
   Breather pipe ....          Cylinder head ....
   Distributor ....            Fan ....
   Fan belt ....               Engine supports ....

65. Rearrangement Exercises.

The rearrangement test is peculiarly adapted to testing
information in industrial education which involves chronological order, order
of operations, and classification of materials according to grades or
quality. It lends itself very well to measuring information which
involves the employment of skills in sequence, or as a means of checking
in a verbal fashion the plan of a job involving motor skills.

Rearrangement exercises which involve five to eight relations
probably do not need correcting for chance. If carefully constructed,
rearrangement exercises are objective and easy to score. They are
useful in diagnosing special difficulties as they are revealed by a pupil’s
inability or ability to recognize proper relations in a test situation.

Although this form of test exercise is quite difficult to construct
and requires considerable space, it has been used successfully for
measuring home mechanics. The following examples from the
Newkirk-Stoddard Home Mechanics Test⁷ serve to illustrate its operation in
this field.


Samples of Rearrangement Exercises

12. Directions: On this and the following pages are given a number of
common jobs in home mechanics. The proper steps for carrying out each job are
given here, but these steps are not placed in the correct order. Examine each
job in turn, and decide which step should come first. Place the number of this
step in the first pair of parentheses, that is, the parentheses at the left. In the
same way insert the numbers of the remaining steps in the proper order or
sequence, so that when you have finished, one can read the numbers in the
parentheses from left to right and find out just how to carry out the steps in
the whole job.

Sample: Job: To Set Casters.

   (1) Drive caster-sheafs.
   (2) Select a bit and drill the hole.
   (3) Select a suitable caster.
   (4) Insert the caster and test.
   (5) Mark the point for the location of the caster.

Rearrange the numbers to show correct procedure:

   (3) (5) ( ) ( ) ( )

13. Directions: For the jobs which follow, the connections called for are to
be indicated right on the diagrams by drawing in lines with pencil or pen. Read
the directions for each job very carefully. When you have figured out how the
wires should go, mark them in neatly and clearly. If you don’t know all the
connections, mark those you think are right.

Job 4. To Connect Three Dry Cells in Parallel.

Directions: Show the correct circuit by drawing lines between the black dots.

[Diagram: Dry Cells]

⁷ Newkirk, L. V., and Stoddard, George D., Newkirk-Stoddard Home Mechanics
Test, Bureau of Educational Research and Service, State University of Iowa,
Iowa City, Iowa, 1928.

II. PERFORMANCE TEST EXERCISES 

66. Objective Performance Exercises. 

Objective exercises for the measurement of performance have not
been developed to the point of perfection that characterizes many of
the objective pencil-and-paper tests of information. There is need for
much additional experimentation to evaluate the most useful types of
exercises for the measurement of performance.

Such performance exercises as have been developed are
predominantly of the recognition type. The object is to allow pupils to modify
materials with tools or instruments or to recognize types or qualities of
materials and to check the responses in an objective manner.
Objective performance exercises must tell the pupil exactly what to do
and not allow the order of major steps to depend on recall. For
example, if it is desired to measure a pupil’s ability to bore a hole in
1-inch stock with a No. 8 bit, he should be given the bit, brace, and
stock with directions, but should be carefully supervised to see that
the directions are followed. This will result in a sample of the pupil’s
work under standard conditions which can be rated and compared
with similar samples of other pupils’ work. If, on the other hand, a
pupil is told to get a No. 8 bit and to bore the hole and he uses a
No. 16 bit with which to bore the hole, the sample will not be entirely
comparable with the No. 8 samples. If the pupil makes a mistake in
the selection of the specified size of bit, it may indicate in a rough way
that he does not know much about the sizes of bits, but it adds an
uncontrolled variable to the performance factor without completely
testing the pupil’s knowledge of the sizes of bits. Knowledge of the
different sizes of bits and ability to bore a hole are two different things,
from the standpoint of test construction. The situation is similar to
that in which a teacher asks a pupil to write down the name of a
cabinet wood, and the pupil writes the word “wallnut.” The pupil’s
response is correct, but his spelling is faulty. The temptation is to
lower the grade because of a misspelled word although the answer is
correct. In this case the pupil should have a mark in spelling and a
mark for his knowledge of the wood, but these two variables should
not be allowed to interfere with each other. The same is true of the
sizes of bits and the ability to bore a hole. They are independent
variables both of which should be tested, but not in the same type of
situation.
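
The principle of keeping two variables apart in scoring may be illustrated with the “wallnut” case. The sketch below is only one possible arrangement and is offered with several assumptions: the list of accepted woods is hypothetical, the function name is invented, and the degree of resemblance required before a misspelled word is accepted as the intended answer (the cutoff of 0.8) is an arbitrary choice. The comparison is made with the `get_close_matches` function of Python’s standard `difflib` module.

```python
import difflib

# Hypothetical list of answers the teacher is prepared to accept.
CABINET_WOODS = ["walnut", "mahogany", "oak", "maple", "cherry"]


def mark_response(response, accepted=CABINET_WOODS, cutoff=0.8):
    """Return separate marks for knowledge of the wood and for spelling.

    Knowledge is credited when the response is recognizably one of the
    accepted words; spelling is marked only when knowledge is credited.
    """
    word = response.strip().lower()
    close = difflib.get_close_matches(word, accepted, n=1, cutoff=cutoff)
    if not close:
        return {"knowledge": 0, "spelling": None}
    return {"knowledge": 1, "spelling": 1 if word == close[0] else 0}


print(mark_response("wallnut"))
print(mark_response("Walnut"))
print(mark_response("pine"))
```

Here “wallnut” earns the mark for knowledge and loses only the mark for spelling, while a response that resembles none of the accepted words receives no knowledge credit and no spelling mark at all. The two marks are recorded separately and neither lowers the other.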

Performance test exercises may be divided into four groups
according to use, namely, (1) tests of quality or accuracy, (2)
identification of materials or tools, (3) technique of tools or instruments, and
(4) speed or rate of response. These four concepts, which have been
discussed in Chapter V, will be briefly reviewed here with
illustrations of test exercises which have been used in measuring the factors.

67. Quality or Accuracy Exercises. 

The quality or accuracy of industrial education work is determined
by carefully evaluating materials which have been modified in some
significant way with tools, materials, or instruments. A test exercise
for measuring quality of workmanship must allow the pupil to modify
materials under genuine and controlled shop conditions, so that the
results can be rated with reasonable objectivity and compared with
the results of other pupils who have practically the same background
and physiological development. The pupil not only must modify
materials under controlled conditions, but also must modify enough
materials to give an adequate or reliable sampling of the abilities being
measured.

The following guiding principles may be helpful in constructing 
objective test exercises of quality: 

1. Provide a job which will give adequate samples of the results of 
the tool or instrument operations being measured. 

2. Give specific directions for doing the work. 

3. Provide all tools and materials necessary. 

4. Measure the results by physical measurements, quality rating 
scales, and where necessary, by inspection. 
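
The fourth principle, measurement by physical measurements, may be sketched for a sawing exercise. It is assumed here, purely for illustration, that the teacher measures at several points along the cut how far the kerf departs from the marked line, in inches, and that a departure of 1/32 inch is taken as the allowable tolerance. Neither the tolerance nor the figures come from the text, and the function name is hypothetical.

```python
def accuracy_summary(deviations_in, tolerance_in=1 / 32):
    """Summarize measured departures (in inches) of a cut from its line."""
    if not deviations_in:
        raise ValueError("at least one measurement is required")
    sizes = [abs(d) for d in deviations_in]
    return {
        "mean_abs_deviation": sum(sizes) / len(sizes),
        "max_deviation": max(sizes),
        "points_within_tolerance": sum(1 for s in sizes if s <= tolerance_in),
    }


# Hypothetical measurements at six points along one pupil's cut;
# the sign shows on which side of the line the saw wandered.
cut = [0.0, 0.03125, -0.0625, 0.015625, 0.03125, -0.015625]
print(accuracy_summary(cut))
```

A summary of this kind gives each pupil's sample the same treatment, which is what allows the results to be compared from pupil to pupil. Which figure to report, the average departure, the worst departure, or the count within tolerance, is left to the judgment of the teacher.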


S.'i.MPLE or Quality or Acctjr.a.cy Exercises 

14. Operation: To saw to a line with an 8-point oioss-cut saw Saw as accu¬ 
rately as you can. 




to 






Eig. 8. 


Malenah: Eight-point cross-cut saw in good condition. A soft wood board 
free from knots, ts" x 6" x 2', suifaced on four sides and laid off as shown in 
the drawing. 

Directions: Place board in position for sawing. 
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15. Opeuition: To buru a hole wilh a Vi" "wood bit. 

Ttiolx and Miitniaih. A bit. ami brace, a jiiecc uf soft wood as indicated m 
the diaKiain, a lieiicb, viwe, and Iry-.'-quare. 


Dirndiamt: Uore holes lliioiiirh the wood block perpemlieular to the Binf.aco 
at (he pniiit iiidicaii.ed. 

10. Ojinialnpi: To cut a line iismi' tin .siiiiis 

Tooli anil Matrruih- A liiiir of .•'b.up, properly acljvistc'd (in t-iiips, a icieee of 
XX tin plate; as shown in the; diagnim with the euttmp! hue marked. 





Fiq. 10. 

Dircrltiina; Cut the tin into two 1" Sitrip.s. Cut directly on the line. 

Examples 14, 15, and 16 are samples of short manipulative test
exercises designed to measure ability to saw to a line with a cross-cut saw, 
ability to bore a hole perpendicular to a surface with an auger bit, and 
ability to cut strips of tin with tin snips. Many tool operations can 
be tested in this manner. The construction of complete tests of
quality or accuracy is discussed in detail in Chapter XI. 

68. Identification Exercises. 

Identification exercises are very useful for testing the pupil’s ability 
to recognize materials, instruments, and tools. They are also used for 
measuring a pupil's ability to analyze special difficulties. The
following significant principles in the construction of identification
exercises should prove suggestive to the teacher: 

1. Provide a representative sample of the objects to be identified. 

2. Suspend materials so that they can readily be examined. 

3. Score the items by checking the objective written responses. 

The identification exercise is easy to use and is objective in scoring, 
and the same sample panel can be used, by changing the items, for
testing the pupil’s ability to identify a number of different materials,
fixtures, or tools. The authors have found it advantageous to suspend
the items because it allows the pupil to hold the items in his hands,
to lift them, to smell them, etc. This gives a natural psychological
approach. It also prevents certain optical illusions. For example, it
is difficult to realize that a 6 penny nail is not a 3 penny common
when it is fastened securely beside a 60 penny spike. 

Fig. 9. 

Samples of Identification Exercises 

Identification of Materials 

17. Directions: Number your paper from 1 through 8 along the left-hand 
margin. Opposite each number write the name of the wood that is hung under 
the corresponding number on the panel. 

Fig. 11. 


Analysis and Identification of Defects in Bells and Buzzers 
18. Directions: Number your paper from 1 through 6 along the left-hand 
margin. Opposite each number write any defect in the bell or buzzer that is 
hung under the corresponding number on the panel. Do not guess. 

Testing for Defective Fuses 

19. Directions: Number your paper from 1 through 6 along the left-hand 
margin. Test out the fuses with a test lamp. Opposite each number on the 
paper indicate whether the corresponding fuse on the panel is blown or
satisfactory. Do not guess. 

Fig. 13. 


69. Technique Exercises. 

Technique exercises are designed to measure a pupil’s method of 
manipulating tools, machines, instruments, or materials. It is
possible to do work of good quality with poor technique, but great skill 
can scarcely be developed without the fundamental techniques with 
tools and materials. Shop technique does not lend itself to
measurement with complete objectivity. The technique exercises require the 
pupil to do certain things which demand the manipulation of tools and 
materials, which then provide a means of rating the major techniques. 
Test exercises of technique, like exercises of quality, require thought 
and experimentation in their construction but are valuable in
measurement, diagnosis, and teaching. 

The teacher will do well to consider the following guiding
principles in the construction of test exercises for measuring technique: 

1. Provide activities which will call for the use of tools or
instruments in which technique is to be rated. 

2. Give specific directions for doing the work. 

3. Provide enough activity to give adequate samples of the
various techniques. 

4. Provide necessary tools and materials. 

5. Rate the techniques by using a rating scale. 








Test Exercise on Technique 

20. Operation: To saw to a line. 

Directions: Saw the board on the line as marked, perpendicular to the surface. 

Fig. 14. 

Tools and Materials: Ripsaw and cross-cut saw in good condition. Bench, 
vise, and a piece of soft wood marked as indicated in the drawing. 

Rating Scale.⁸ As the pupil makes the cuts with the saw, the
following points are observed and checked. Each item is rated on the 
basis of 10, and the score is determined by adding the ratings. 


Sawing 

1. Clamping stock. 

Stock should be held so that it will not be loosened or cracked and 
should also facilitate sawing. 

2. Starting cut. 

With thumb at line, saw should be placed against the thumb. Saw 
should be pulled back slowly a few times to make a groove, then pushed 
forward. 

3. Holding saw. 

Saw should be held in right hand. For cross-cut, angle should be 45 
degrees; for rip, 60 degrees. 

4. Stroke. 

Stroke should be long and even, not too fast. Proper angle should be 
kept during sawing. Line should be followed. 

5. Ending cut. 

One should reach over with the left hand and hold on to the piece 
being cut off. Saw strokes should be slow with little pressure to prevent 
breaking off the end. 
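
The arithmetic of this rating scale can be sketched in a few lines of Python. The function name `technique_score` and the sample ratings are assumptions made for illustration; the only rule taken from the text is that each of the five points is rated on the basis of 10 and the ratings are added.

```python
# Hypothetical helper: the names and the sample ratings are illustrative assumptions.
SAWING_POINTS = ["Clamping stock", "Starting cut", "Holding saw", "Stroke", "Ending cut"]

def technique_score(ratings, max_per_item=10):
    """Add the ratings for the observed points; each rating must lie in 0..max_per_item."""
    missing = [point for point in SAWING_POINTS if point not in ratings]
    if missing:
        raise ValueError(f"unrated points: {missing}")
    for point in SAWING_POINTS:
        if not 0 <= ratings[point] <= max_per_item:
            raise ValueError(f"{point}: rating {ratings[point]} is outside 0..{max_per_item}")
    return sum(ratings[point] for point in SAWING_POINTS)

example = {
    "Clamping stock": 8,
    "Starting cut": 7,
    "Holding saw": 9,
    "Stroke": 6,
    "Ending cut": 10,
}
print(technique_score(example))  # 40 out of a possible 50
```

Keeping the five ratings separate until the final sum preserves the diagnostic value of the scale, since a low rating on one point shows which technique needs further teaching.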

⁸ Sample 20 gives the method of rating a shop technique. The construction 
of tests for rating techniques is discussed in more detail in Chapter XI. 

70. Speed or Rate of Response Exercises. 

Rate of response is of considerable value in trade courses, but of 
less importance in the cultural courses of the elementary and junior 
high school. Tests of speed will be discussed and illustrated in detail 
in Chapter XI, but it seems desirable to point out here that quality 
must be clearly defined and held nearly constant or rate of response 
cannot be measured accurately. Speed and accuracy are each
variable factors in achievement and performance. Test exercises designed 
to measure speed or rate of response must present a well-defined
activity with appropriate standards. When the pupil can do the activity 
and meet the required standards, then he is ready to take the test to 
see how rapidly he can do the problem. Practice will result in a 
pupil’s improving his score up to the point where further increase is 
limited by native ability. The amount of speed a pupil should have 
can be determined by the demands of the job or by comparison with 
the best efforts of others. 

In the construction of exercises designed to measure rate of
response the following principles should be observed: 

1. Define the exact work to be done. 

2. Give definite standards. 

3. Give the pupil a chance to achieve the standards and learn 
exactly what they are. 

4. Do the job on a carefully controlled time basis. 

5. Score on the basis of known quality and time. 


Sample Rate of Response Exercise 

21. Operation: Rate of boring holes through %” soft wood. 

Part I 

1. Preliminary Activity: Bore three ¼” holes through the piece of soft wood 
given you at the points indicated in the drawing and on the board. 

Fig. 15. 

Practice boring holes until you can do as well as or better than the sample 
given you by the teacher. 

Part II 

1. Say to the pupil, “Now that you can bore holes accurately and neatly, we 
want to learn how quickly you can bore three holes. Do your work the same 
way you did in practicing. Continue to bore holes which are as good as or better 
than the sample.” 

2. Be sure that the pupil has a piece of wood ready for boring and that the 
bit is in the brace and that the vise is in working order. 

3. Say to the pupil, “Ready, begin.” Check the amount of time in seconds 
that is required to bore the holes. 

4. Score the exercises by retaining each sample of boring that is as good as or 
better than the minimum sample. For example, the pupil bores the three holes 
in 60 seconds. Two holes are satisfactory, but one is inferior, so the pupil’s score 
on the exercise is two holes in sixty seconds. 
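
The scoring rule in step 4 can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name `rate_of_response_score` and the use of a true/false judgment for each hole are assumptions; the text specifies only that inferior samples are discarded and that the score is reported as satisfactory pieces in the time taken.

```python
# Hypothetical scoring helper; the name and the return shape are assumptions.
def rate_of_response_score(hole_ok, seconds):
    """Count only the holes that meet the minimum sample and pair the count with the time."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    satisfactory = sum(1 for ok in hole_ok if ok)
    return satisfactory, seconds

# The example from the text: three holes bored in 60 seconds, one of them inferior.
holes, secs = rate_of_response_score([True, True, False], 60)
print(f"{holes} holes in {secs} seconds")  # 2 holes in 60 seconds
```

Reporting the count and the time together, rather than a single ratio, keeps quality and speed visible as the two separate factors the section describes.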


III. CHANCE FACTORS IN OBJECTIVE EXAMINATIONS 
71. Guessing in Objective Tests. 

Test exercises of the recognition type, in which one or more
suggested wrong responses accompany the correct response, are definitely 
affected by the factor of chance or guessing. Ordinary recall items 
which call upon the student to initiate and state his response naturally 
are not influenced by this factor. Most alternate-response (true-false; 
yes-no) items open up the possibility of a fifty-fifty chance of the 
individual’s guessing the correct answer in all items about which he has 
no information at all. Multiple-response items of the three-, four-, 
or five-response types decrease this probability as the number of
alternate responses is increased. Within certain limits, chance operates in 
matching exercises, and to a smaller degree in exercises using the
rearrangement or the classification testing techniques. 

The actual degree to which chance affects a pupil’s score is almost 
impossible to determine. It depends upon the form of the test
exercises and their arrangement in the test. It also depends upon the 
amount of information, or lack of it, which the pupil has concerning 
the specific item. If we reason from an a priori basis, it is quite
apparent that the pupil who is totally ignorant of the facts involved in a 
test item has a fifty-fifty chance of guessing the correct response in a 
true-false or two-response test. For instance, if an individual were 
to respond to a properly balanced true-false test with the exercises 
themselves covered with a sheet of paper, by marking at random the 
true-false responses along the margin of the paper, this would
represent a situation in which total ignorance of the items actually
operated. If the test were long enough to provide a reasonable sampling, 
the resulting score on the test under these conditions should be zero, 
since no knowledge of the test content would be called into play in 
responding to it. Pure chance would be operating. Under these
conditions the individual should mark almost exactly the same number of 
wrong responses as right ones. If the number of right and wrong 
responses did not check closely, it would mean that the test itself was 
not properly balanced, or that it was not long enough to be reliable. 
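
The claim that random marking should yield a score of zero can be checked by exact arithmetic. The sketch below assumes that the score is taken as rights minus wrongs, the scoring used for true-false tests in Section 72, and that each item is marked correctly with probability one-half; the function name is an illustrative assumption.

```python
from fractions import Fraction
from math import comb

def expected_rights_minus_wrongs(n_items):
    """Exact expected value of (rights - wrongs) when every true-false item is marked at random."""
    total = Fraction(0)
    for rights in range(n_items + 1):
        wrongs = n_items - rights
        # Probability of exactly `rights` correct marks out of n_items fair guesses.
        probability = Fraction(comb(n_items, rights), 2 ** n_items)
        total += probability * (rights - wrongs)
    return total

print(expected_rights_minus_wrongs(100))  # 0
```

The expectation is zero for a test of any length; what a longer test adds is a smaller chance that any single pupil's score departs far from that expectation, which is the sense in which the text requires the test to be long enough to give a reasonable sampling.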

Now in actual practice, there should be very few items in an
examination about which the student should be totally ignorant and be 
forced to resort to pure guessing, if the test is made up of valid items, 
selected from the actual content which the student has had an
opportunity to learn. Accordingly, this fringe of knowledge, slight though 
it may be, should enable him to succeed more often than he fails. In 
other words, pure guessing does not operate in the use of a valid
alternate-response test. In theory, guessing would operate to increase the 
score through a lucky guess just as often as it would tend to reduce 
the score through an unlucky guess. The fact that the pupil is never 
so ignorant of such a test item as this reasoning would assume, makes 
it desirable to conclude that he at least guesses one exercise right for 
each one guessed wrong. This results in reducing his score in the
number of exercises right by the number of exercises he missed in the test. 

The best available evidence seems to indicate that the apparent 
validity of alternate-response tests is increased slightly by assuming 
that guessing actually does take place and correcting the score on that 
basis. The net result of the application of this type of correction is 
possibly to over-correct slightly, but in most cases this is not serious. 
Ruch⁹ has suggested that if correcting for chance in recognition forms 
appears to be unsatisfactory, approximately the same effect may be 
brought about through increasing the length of the test by the use of 
10 to 15 per cent more test items than would be required for the
expected reliability of measurement. 

⁹ Ruch, G. M., The Objective or New-Type Examination, Scott, Foresman and 
Company, Chicago, 1929. 

72. Correcting for Chance in Objective Tests. 

The typical procedure for the correction of exercises for the
operation of chance may be generalized in the following formula: 

$$C = R - \frac{W}{N - 1}$$

in which C is the corrected score, R is the number of exercises
answered correctly, W is the number of exercises answered incorrectly, 
and N is the number of choices in the exercise. Thus, if there are 5 
choices in a multiple-response examination, the correction consists in 
taking one-fourth of the number of wrong answers from the number of
exercises answered correctly. In true-false or other alternate-response 
tests, the formula works equally well. In true-false tests N equals 2. 
Thus, the denominator of the formula becomes 1, and the net result is 
to deduct the number of wrong responses from the number answered 
correctly. For example, a student in responding to a true-false
examination consisting of 125 items, omits 11 and answers 14 incorrectly. 
The number of exercises he answered correctly (R) is found by
subtracting the omitted and incorrectly answered exercises from the total 
number of items in the test. 125 - 11 = 114; 114 - 14 = 100, the 
number right. The correction for guessing involves taking the wrongs 
from the rights (R - W). Accordingly, the corrected score is 100 - 14, 
or 86. In cases where the student misses more exercises than he 
answers correctly, the practice ordinarily followed is to assign scores of 
zero, rather than to show a negative score. Practically, the individual 
could scarcely know less than zero, and furthermore, it is likely that 
such a situation arises out of the unreliability of the test itself. 
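
The correction can be sketched in Python as a check on the worked example. The function name `corrected_score` and the `floor_at_zero` option are illustrative assumptions; the formula, the treatment of omitted items, and the practice of assigning zero in place of a negative score come from the section above. The counts in the five-choice illustration are made up.

```python
from fractions import Fraction

def corrected_score(right, wrong, choices, floor_at_zero=True):
    """Apply C = R - W / (N - 1); omitted items count neither as right nor as wrong."""
    if choices < 2:
        raise ValueError("an exercise needs at least two choices")
    score = Fraction(right) - Fraction(wrong, choices - 1)
    if floor_at_zero and score < 0:
        return Fraction(0)
    return score

# Worked example from the text: 125 true-false items, 11 omitted, 14 wrong.
items, omitted, wrong = 125, 11, 14
right = items - omitted - wrong
print(right, corrected_score(right, wrong, choices=2))  # 100 86

# Five-choice illustration (made-up counts): one-fourth of the wrongs is deducted.
print(corrected_score(40, 8, choices=5))  # 38
```

Exact fractions are used so that a deduction such as one-fourth of 7 wrongs is carried without rounding; whether to round the final score is left to the teacher.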

Attention should possibly be directed once more to the matter of 
the specific instructions to be given the student in the use of
recognition-type tests. The best practice, based on a conservative estimate of 
the available evidence, seems to be to direct the pupils not to guess in 
taking the test, but to correct the resulting scores on the test exactly 
as if they had guessed. The only exception to this general rule seems 
to arise in the use of double true-false exercises. In this case, it
appears desirable to encourage the student to attempt to answer every 
possible exercise in both parts of the test. The method of scoring the 
test in terms of pairs right takes adequate care of any tendency or 
necessity on his part to resort to pure guessing. Furthermore, unless 
the items are utterly invalid, he must have a fringe of information 
about many items which he might be tempted to omit under the
conditions of the typical true-false test. Since missing or omitting one or 
both of the paired exercises makes it impossible for the pupil to score 
on that pair, he should be given the benefit of the doubt and a chance 
to score on every pair of exercises. 
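
Scoring in terms of pairs right can be sketched as follows. This is only an illustration under stated assumptions: the data layout (one two-part tuple per pair, with `None` standing for an omitted exercise) and the function name `pairs_right` are hypothetical; the rule taken from the text is that a pair scores only when both of its exercises are answered correctly.

```python
# Hypothetical layout: each pair is a tuple of two true-false responses; None marks an omission.
def pairs_right(responses, key):
    """Count the pairs in which both responses agree with the scoring key."""
    if len(responses) != len(key):
        raise ValueError("responses and key must cover the same number of pairs")
    score = 0
    for (first, second), (key_first, key_second) in zip(responses, key):
        if first == key_first and second == key_second:
            score += 1
    return score

key = [(True, False), (False, True), (True, True)]
responses = [(True, False), (False, None), (False, True)]
print(pairs_right(responses, key))  # 1
```

In the example the second pair fails because of an omission and the third because of one wrong response, which is why the text advises the pupil to attempt both parts of every pair.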

IV. TYPES OF ESSAY-TYPE EXAMINATION EXERCISES 

Although a general program of measurement of classroom
products in the industrial arts is most likely to be advanced through the 
elimination of the subjective features of the teacher’s judgment at all 
possible points, it must nevertheless be recognized that certain
desirable products of the classroom and shop simply do not lend
themselves to the objective approach. Furthermore, many teachers of
industrial education wish to make use of essay-type tests occasionally 
for other reasons. Since this type of test is still used, and probably 
always will be to some extent, it is unquestionably desirable to point 
out here some of the possibilities and limitations of this type of
measurement, as well as to submit certain suggestions which if carefully 
utilized may result in the distinct improvement of the less objective 
methods of measurement. 

Essay-type examinations, though generally not so reliable as the 
average objective examination, frequently secure measures which are 
just as valid as they would be if stated in objective form. The lack 
of reliability in the essay or traditional examination lies mainly in the 
limited extent of the sampling which the use of this form of question 
permits, and in the lack of objectivity in scoring the items. Many 
examinations composed entirely of essay questions are valid in the 
general sense of the term. Their limitation results from the
incompleteness of the sampling taken and from the uncertainty with which 
the results are evaluated. 

73. Traditional Examination Questions. 

The traditional or discussion-type examination is almost uniformly 
made up of recall questions. The following types are representative: 


I. SIMPLE RECALL. 

1. Name four different types of wood stains. 

2. Name the different grades of sandpaper used in woodfinishing. 

3. Name the ingredients of paste wood filler. 

II. DESCRIPTION. 

Samples: 

1. Why is walnut a good cabinet wood? 

2. What are the chief characteristics of redwood? 

3. Why is balsa a favored wood for constructing model aircraft? 

4. What are the characteristics of quarter-sawed oak? 

III. COMPARISON AND ANALYSIS. 

Samples: 

1. What is the difference between a superheterodyne circuit and a 
radio-frequency circuit? 

2. How does a dynamic speaker differ from a magnetic one? 

3. What is the difference between an inside aerial and an outside 
aerial? 

4. How does a battery set differ from an a-c set? 

5. Is there a difference in the underlying principle of head phones and 
a magnetic speaker? Explain. 

IV. PROCEDURE. 

Samples: 

1. Give the steps in applying a rubbed varnish finish. 

2. Give the steps in squaring a board. 

3. Give the procedure for fuming oak. 

4. Give the procedure for tinning and soldering copper. 

74. Constructing Essay-Type Exercises. 

On first thought the essay-test exercise seems easier to prepare 
and use than the objective type. In the way the traditional
examination is ordinarily used, or perhaps we should say misused, it does take 
less time to prepare, but the results obtained are much more
subjective and unreliable than those from objective tests. If the
essay-type test is constructed so as to give a fairly reliable result it is not 
easier to construct than the objective test exercise. In fact, it may 
demand a great deal more time and careful thought. 

The following rules have been found very helpful in the
construction of essay-type questions. 

1. State the question in a simple, direct manner so that it demands 
the reproduction, comparison, or evaluation of a specific unit of
instructional material. 

Example: Poor. Name all the stains you can. 

Better. Name four types of wood stains. 

2. Write out just exactly the answer that is expected for each 
essay question in the test. This may be either in outline, brief
paragraph, or diagram. 

Examples: 1. Name four types of wood stains. 

Teacher’s answer: 1. Water stains. 

2. Oil stains. 

3. Spirit stains. 

4. Chemical stains. 

2. Why is walnut better suited to cabinet work than fir? 

Teacher's answer: Walnut is better suited for cabinet work because of its 
natural beauty, color of the wood, close grain, and durability; it is stronger 
than fir, does not splinter as readily and takes a better finish. 

75. Scoring Essay-Type Test Exercises. 

The more objectively the essay test exercises can be scored, the less 
the results will be influenced by the personal judgment of the scorer. 
The following suggestions have been found valuable for use in
correcting essay-type exercises: 

1. Tests should be scored by the one who makes out the
questions. He should know exactly what responses are intended and write 
them down. 





2. Each pupil taking the test should write his name on the back 
of the test paper, and the scorer should disregard the name until the 
test is scored. This eliminates the subjective factor of being influenced 
or biased in judgment because of former contacts with the pupil. 

3. The scorer should not mark off for misspelled words, sentence 
structure, paragraphing, poor writing, etc. Similarly, he should not 
increase the score for excellence in these things. However, such
factors may be indicated or checked on the examination. The reason for 
this is that the test is to measure the pupil's knowledge of certain
information in an industrial education course. If it is desirable to test 
a pupil’s ability to write, spell, or use correct written English, suitable 
tests should be given for this purpose which are valid and reliable. 

4. Essay test exercises can be corrected most simply by correcting 
each item in all the tests rather than by correcting the entire tests 
separately. This enables the scorer to concentrate on the answer to 
one test exercise and thus he is better able to judge the merits of the 
several pupil responses to the same question. 

5. Rate each question on a scale of 10 or 20 and then add the 
ratings on all the essay test exercises to get the mark for the paper. 
This method helps to objectify the score. The score is based on a 
number of careful judgments rather than on one complex judgment 
for the entire score. 
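
Suggestions 4 and 5 can be combined in a short sketch: the ratings are recorded one question at a time across all papers, and only afterward are they added to give the mark for each paper. The function name, the data layout, and the paper labels are assumptions made for illustration; anonymous labels stand in for the names written on the back of the papers.

```python
# Hypothetical layout: question -> {paper label: rating on the chosen scale}.
def score_essay_papers(ratings_by_question, scale=10):
    """Add the per-question ratings for each paper and return paper label -> total mark."""
    totals = {}
    for question, ratings in ratings_by_question.items():
        for paper, rating in ratings.items():
            if not 0 <= rating <= scale:
                raise ValueError(f"{question}, {paper}: rating {rating} is outside 0..{scale}")
            totals[paper] = totals.get(paper, 0) + rating
    return totals

ratings = {
    "Q1": {"paper-07": 8, "paper-12": 6},
    "Q2": {"paper-07": 5, "paper-12": 9},
    "Q3": {"paper-07": 10, "paper-12": 7},
}
print(score_essay_papers(ratings))  # {'paper-07': 23, 'paper-12': 22}
```

Organizing the record by question rather than by paper mirrors the order in which the scorer works, so that each rating is one simple judgment made against the same written answer.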

The essay-type exercise can be made much more objective and the 
subjectivity of the teacher’s marks can be significantly reduced if a 
method similar to the one outlined in the preceding paragraphs is
followed. However, certain limitations of the essay-type question are 
obvious. Frequently the time gained by the teacher in preparation is 
lost in scoring. At best, the essay-type test is not as valid or reliable 
as an average objective-type test. Research studies have shown the 
reliability to be around .59 on an average. 

Kelly¹⁰ and Fauber¹¹ and Ruch¹² have shown that subjectivity of 
teachers’ marks can be reduced significantly through the use of scoring 
rules. But even if the subjectivity of teachers’ marks could be reduced 
by half, they still would not provide measures which are nearly so 
reliable as those obtained from objective tests. Ruch states that 
“Experience and experiment have shown that the results of an essay 
examination cannot be evaluated fairly by human minds.” In
addition to the psychological difficulty of making a complex judgment, 
there is also the serious disadvantage of limited sampling, which has 
been mentioned previously. The average objective tests will sample 
at least five times as widely into a field of information as an essay 
examination requiring the same testing time. 

¹⁰ Kelly, F. J., Teachers’ Marks, Teachers College Contribution to Education, 
No. 66, p. 83, Columbia University, New York, 1914. 

¹¹ Unpublished master’s thesis, 1926, University of Iowa. 

¹² Ruch, G. M., The Objective or New-Type Examination, Chapter I, p. 20, 
Scott, Foresman and Company, Chicago, 1929. 

SUMMARY 

The more important techniques for use in the construction of
traditional examinations and informal objective tests are summarized in 
this chapter. The differences between the standardized test and the 
informal objective examinations are pointed out. 

The problems involved in controlling the chance or guessing factor 
in certain forms of objective tests are treated briefly in this chapter 
because of the close relation of this factor to the technique of
reliability of measurement used. Recognition is given to the fact that 
not all the measurement that goes on in the classroom and shop should 
be objective. 

SUMMARY EXERCISES FOR DISCUSSION 

1. Why are many of the paper-and-pencil tests which are useful in other
educational fields not well suited to the demands of objective measurement in 
industrial subjects? 

2. Illustrate by example each of the main types of objective exercises suited for 
use in measurement in industrial arts courses. 

3. What are the main advantages and disadvantages of objective examinations? 

4. Show how the general formula for correcting for guessing in objective tests 
actually works in an alternate-response test, and in a five-response test. 

5. Evaluate the suggestions for improving the objectivity of scoring of
essay-type exercises. 

CONSTRUCTION AND USE OF INFORMAL SHOP TESTS 

I. CONSTRUCTION OF AN INFORMAL OBJECTIVE TEST 

76. Steps in Building an Objective Information Test. 

In contrast with certain other school subjects, the industrial arts 
subjects present two widely different phases of achievement for
measurement. One of these phases is expressed in terms of the ability of 
the individual student to go into the shop, and, by following specific 
directions, come out with a product of a given quality which is itself 
evidence of achievement. This is a test of performance. The other is 
expressed in terms of knowledge of facts and their relationships which 
may lie back of the student’s actual performance. This is a test of 
information. The performance test calls for direction, action,
production. The information type is usually a paper-and-pencil test. It is 
obvious that the best possible test of information cannot be wholly 
valid for any industrial education course because of the fact that it 
deals only with information and does not measure such factors as 
quality or rate of response, techniques, and personality traits, which 
unquestionably are important elements of the course. Both are
essential to complete measurement of accomplishment in this field. The 
testing techniques for both types of tests are discussed in the
preceding chapter. The steps in constructing informational types of tests, 
and in deriving rating scales for the evaluation of the quality of
products obtained under performance testing conditions, are set forth in 
this chapter. 

The distinctive feature of the teacher-made objective examination 
which makes it especially useful in the evaluation of classroom 
achievement is the closeness with which its content can be made to 
parallel the subject-matter actually taught to the class. This is 
merely another way of stating that its validity is high in proportion 
to the extent that the teacher includes in the examination exercises 
sampling from facts which the students have had an opportunity to 
learn. A teacher who knows his pupils and his subject-matter may 
readily construct an objective examination which will have all the 
merits of the standardized test (except the standards or norms
themselves), and few or none of the limitations. Critical selection of items 
from the important phases of the subject which have been given
instructional emphasis will guarantee the validity of the test.
Observance of the simple principles of formulating objective exercises 
will produce an objective test. A wide sampling over the significant 
phases of the subject will produce a test long enough in terms of 
items and working time to secure reliable results. Teacher-made tests 
which meet these three criteria leave little to be desired. 

77. Securing Validity. 

Objectivity and reliability in an examination are qualities which 
are functions of the form of the exercises used and the breadth of 
sampling of items taken. That is to say, they are the results of the 
application of certain principles of measurement which can be learned 
by any classroom teacher. The first constructive step in the
development of the informal objective examination is the establishment of a 
basis for the validation of its content. 

Course of Study as the Basis for Validity. The validation of an 
examination presumes a knowledge of exact details of the curricular 
material to be taught, as well as a background of experience and
judgment adequate for a critical evaluation of the social and practical 
significance of the various units of instruction. This means, clearly 
enough, that the teacher must have an intimate knowledge of the 
content of the course of study. 

In order to build up a suitable background for understanding the 
development of the objective tests presented later in this chapter for 
purposes of illustration, the following course-of-study outline in
woodworking is presented. The content of this outline is not put forward 
as ideal, but rather as a body of information and skills affording
materials suitable for illustrating several types of tests useful to the
industrial education teacher. The illustration from woodworking is used 
here mainly because it is the most widely taught and best understood 
instructional division in industrial education, and because practically 
all the testing techniques suitable for use in this field may be applied 
to other industrial education subjects. 

Course-of-Study Outline in Woodworking. This outline is
presented as a definite basis for the construction of an objective
examination for an eighth-grade class in woodworking. The unit of
instruction covers ten weeks of woodworking representing one of four 
instructional divisions of a course in general shop. The class itself is 
made up of twenty-six boys coming from middle-class homes typical 
of a mid-western city. 

Objectives of the Course 

1. To develop an appreciation of good materials and workmanship. 

2. To develop handyman abilities with common tools and materials. 

3. To develop hobbies for leisure-time activities. 

4. To further intelligent choice of life occupations. 

5. To give information about the industries and their workers. 

6. To develop desirable social traits and attitudes. 

7. To provide opportunity for planning and problem solving. 

8. To motivate and vitalize academic learning. 

What the boys should be able to do with woodworking tools ¹ 

1. To use a rule in measuring. 

2. To use dividers or compass for laying out curves and dividing spaces. 

3. To use a try-square for testing. 

4. To adjust a plane. 

5. To square a piece of stock. 

6. To saw to a line with a rip or cross-cut saw. 

7. To use back saw. 

8. To use coping saw. 

9. To bore holes in wood. 

10. To fasten with screws. 

11. To trim or pare with a chisel. 

12. To use scraper. 

13. To use sandpaper. 

14. To drive and draw nails. 

15. To lay out and cut a chamfer. 

16. To glue up work. 

17. To fit hinges. 

18. To make butt joint. 

19. To make dowel joint. 

20. To sharpen edge tools. 

What the boys should know about wood and the divisions of the industry 

1. Know the principal characteristics, working qualities, principal uses, and 
sources of supply of the following woods: pines, cypress, oak, walnut, ash, birch, 
maple, mahogany, red cedar, hickory, gum, chestnut, and poplar. 

2. How lumber is cut and milled. 

3. Standard dimensions of lumber. 

4. Knowledge of veneer and plywood. 

5. Kinds of glue and its preparation. 

6. Kinds of nails and their uses. 

7. Kinds and sizes of screws. 

8. Kinds and grades of sandpaper. 

9. Grades and uses of steel wool. 

10. Distinguishing characteristics of period furniture. 

11. Basic principles of good design in furniture. 

12. Use of common types of hinges and fasteners on woodworking projects. 

13. Kinds of grinding and sharpening stones. 

14. Location of manufacturing concerns and labor conditions. 

¹ Adapted from the A. V. A. Committee’s Report on Standards in Industrial 
Arts. 


What the attitude of the boys should be ²

1. Industrious.

2. Cooperative.

3. Self-reliant.

4. Considerate of the rights of others.

5. Ready to assume responsibility.

6. Loyal.

7. Fair minded.

8. Optimistic toward life.

9. Law abiding.

10. Appreciative of duty in common things.

Selection of Major Groups of Informational Items. The next step
in the validation of the content of an informal objective test of
information is to select, in the light of the objectives set up for the
course, the groups of skills which are informational in character and
which can be measured by means of a paper-and-pencil test. The following
summaries represent the major groups of such informational aspects
found in the foregoing outline on woodworking:

1. Different types of planes. 

2. Different types of saws.

3. Sizes of wood bits.

4. Sizes of screws.

5. Kinds and sizes of chisels.

6. Procedure in squaring stock.

7. Sizes of sandpaper.

8. Sizes of nails.

9. Glue and its use.

10. Different types of hinges.

11. Kinds of wood stain.

12. Types of fillers.

13. Different types of brushes.

14. Composition of shellac.

15. Enamel and its composition.

16. Varnish and its composition.

17. Different kinds of paint.

18. Composition of wax.

19. Composition of lacquer.

20. Common joints. 

² This unit is the same for all shop subjects, and probably for the entire
curriculum.

21. Steps in applying stain, filler, shellac, varnish, enamel, paint, wax, and
lacquer.

22. Steps in squaring stock, preparing glue, using sandpaper, sharpening wedge
edge tools, boring holes, and fastening with screws.

23. Principal characteristics and uses of common woods.

24. Dimensions of lumber.

25. How lumber is cut and milled.

26. Veneer and plywood.

27. Grades and uses of steel wool.

28. Principles of design.

29. Characteristics of widely known period furniture.

30. Types of grinding and sharpening stones.

31. Manufacturing concerns and labor conditions. 

Suggestions for Securing a Valid Sampling of Informational Content.
The specific problem of this discussion is to demonstrate how it
is possible to secure a valid sampling of the informational content of
the course. The following rules will be found very helpful in
accomplishing this:

1. Keep clearly in mind the objectives of the course. Try to
formulate questions which will measure the extent to which the objectives
have been achieved. Emphasize the relative and social utility of the
subject-matter and avoid purely factual questions unless they are
essential to building up concepts.

2. Ask questions which the objectives indicate are of most
importance, but under no circumstances ask questions included merely
to “stump” the pupils. Trick questions, and unusually difficult ones,
are only dead weight in the test, waste valuable testing time, and in
general lower the validity of the test.

3. Ask a large number of questions over all parts of the course.
The different types of objective test exercises are best suited for testing
a large number of items in the time ordinarily allotted to
measurement.

4. Have other teachers make suggestions as to the importance of
the exercises selected for the test. Take into consideration the
comments of pupils as to the value of the different test items. If the pupils
consider them unfair, obscure, or too easy, they should be eliminated
or modified before the test is used again.

5. The test cannot be more valid than the course of study on 
which it is based. The progressive teacher will revise his course and 
tests from time to time to bring the work abreast with good practice 
and the results of curriculum research. 

In establishing validity it is a good policy to construct 200 or 250
test exercises based on the course. This furnishes sufficient material
to eliminate undesirable test exercises and still have adequate material
for at least two forms of the test. One form is sufficient for a valid
examination, but two forms make it more valuable. The second form
may be used for testing those absent from the first test, or the test
may be used from year to year, alternating the two forms.

78. Securing Objectivity. 

After the major informational topics have been agreed upon in the
light of the teaching objectives and, when possible, passed upon by
other teachers, the items can be expanded and developed into objective
test exercises. It is a good policy to use the type of objective
exercise which best fits the material, rather than to attempt to make all
exercises conform to a single type, as true-false, multiple-choice, etc.
Chapter X gives examples of test exercises which have been found
valuable in testing information in shop courses.

Two fairly satisfactory methods of procedure are suggested for
recording the test exercises as they are developed. In one, the teacher
may write the exercises on sheets of paper allowing a half inch between
each question, so that, after the questions have been formed, the paper
can be cut into strips with one question on a strip. These strips can
be shifted to eliminate the less desirable ones according to the
teacher’s plans. Similar types of test exercises can be grouped to save time
in manipulation of the items. In the other method, the teacher may
use 3 inch by 5 inch cards and a card index. Each question is put on
a separate card, and the cards are grouped according to the type of
test exercise used. The first draft of the questions in either procedure
should be double spaced to allow for corrections by the teacher after
he has given them critical analysis. The cards can be shifted or
eliminated as desired. The authors have found this second method to be
handier and neater but a little more expensive.

After the test items have been developed and classified according
to types of questions, the next step is to develop suitable directions
and sample exercises for each different group of test exercises.
Directions for tests must be clearly stated and in a vocabulary that the
pupils can comprehend. If long difficult words have been used, the
teacher should attempt to find synonyms which are in more common
usage. In addition to the directions, it is important to provide sample
exercises to give the pupils experience in employing the testing
technique demanded. Pupils who have had little or no experience with
objective exercises will be done an injustice unless they are given ample
directions and practice on the types of exercises used. They may make
low scores because they do not understand what to do, rather than
because they do not know the correct responses to the exercises. If
many pupils fail to understand what to do, it is probable that the
instructions are at fault. If this is not corrected the reliability of the
test will certainly be lowered. The directions which accompany the
samples of objective tests presented later in this chapter are examples
of adequate statements.

79. Rating Exercises as to Difficulty. 

After the questions have been developed and the directions and
practice exercises perfected, the next step is to arrange the different
groups of test exercises in the approximate order of difficulty from
easiest to most difficult. This can be done roughly through inspection
and rearrangement of the exercises by the teacher. If several teachers
pool their judgments of the rankings from easiest to most difficult, the
results will be more reliable. This arrangement of items in order of
difficulty can be further refined after the test is given, by recording the
number of pupils who respond correctly to the various items. Order of
difficulty is quite important in a test because it saves the pupils’
time and secures from them a better psychological reaction. The pupil
is given an opportunity to answer first the exercises that are easier for
him and he is not so likely to use all the testing time on difficult items
and fail to answer many that he does know. The arrangement of items
on the basis of difficulty probably increases the apparent reliability of
the test.
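
The refinement described above, recording how many pupils answer each item correctly and then reordering, can be sketched in a few lines of Python. This is only an illustration: the function name, the data layout, and the sample responses are assumptions made for the example and are not taken from the authors' procedure.

```python
from typing import Dict, List


def rank_items_by_difficulty(responses: Dict[int, List[bool]]) -> List[int]:
    """Return item numbers ordered from easiest to most difficult.

    `responses` maps an item number to one True/False entry per pupil
    (True = answered correctly). Ties keep the original item order.
    """
    def proportion_correct(item: int) -> float:
        answers = responses[item]
        return sum(answers) / len(answers)

    # Easiest item = largest proportion of correct responses.
    return sorted(responses, key=lambda item: (-proportion_correct(item), item))


# Hypothetical responses of four pupils to three items.
sample_responses = {
    1: [True, True, False, False],
    2: [True, True, True, True],
    3: [True, False, False, False],
}
revised_order = rank_items_by_difficulty(sample_responses)
print(revised_order)
```

With real data the dictionary would hold one entry per test item and one response per pupil; the returned list gives the order in which the items might be printed in a revised form of the test.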

80. Rearranging Items on the Basis of Difficulty. 

The following true-false statements, taken from a longer test, are
in the order in which they appeared when the test was first given. After
the test was given and the pupils’ responses were analyzed, a better
order of arrangement in the test was possible. The numbers at the
right indicate the order of increasing difficulty based on an analysis of
the responses of 50 pupils. The exercise numbered “1” is the easiest
item, i.e., was answered incorrectly by the smallest percentage of the
class.

Original Order                                                   Revised Order

 1. Wipe moisture off of tools before putting them away.               6

 2. The marking gauge is used to make a line parallel to an edge.      7

 3. Sandpapering is done to get a smooth surface for finishing.        1

 4. Stain may be applied with a cloth or a brush.                      8

 5. To produce a good surface for finishing, sandpaper across the
    grain.                                                             9

 6. Varnish is thinned with alcohol.                                  10

 7. Good paint preserves the wood.                                     2

 8. Paint and varnish may be applied satisfactorily on damp
    surfaces.                                                          S

 9. Auger bits are graduated or numbered in sixteenths of an inch.     4

10. To bore a clean hole with an auger bit, bore through until the
    spur shows, and then finish boring from the other side.            6

It will be noted that only one of the easiest questions as indicated by
an analysis of 50 pupil responses is in the first five items as listed in
the original test arrangement.

81. Securing Reliability. 

The next essential in developing an informal objective examination
is to make it long enough to secure an acceptable reliability of
measurement. Reliability is obtained by sampling over a wide range of
content and by stating a large number of valid questions in objective
forms which are within the mental and educational range of the pupils
to be tested. Properly constructed objective tests of 75 to 100 or more
exercises are usually highly reliable, whereas the ordinary six-, eight-,
or ten-question essay-examination, with its limited sampling and
subjective scoring, is almost never sufficiently reliable.
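
The link between test length and reliability can be illustrated with the Spearman-Brown formula, a standard result in test theory. The text above does not name this formula, so its use here is an assumption made purely for illustration, and the figures in the example are hypothetical.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened `length_factor` times.

    Assumes the added exercises are comparable to the existing ones
    (the usual Spearman-Brown assumption).
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical case: a short test with reliability .50 is doubled in length.
predicted = spearman_brown(0.50, 2)
print(round(predicted, 2))
```

Under these assumptions the predicted reliability rises as comparable exercises are added, which is consistent with the advice to make the examination long enough.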

Sampling as a Factor in Reliability. The brevity of the statement
and the ease with which the response is recorded make it possible for
the student to respond to many more objective exercises in a specified
period than to those of the discussion types. This makes it possible
for the objective examination to cover a much wider area of
subject-matter, or to cover a given area a great deal more intensively than is
possible with the other type of exercise. The manner in which this
factor of sampling operates to protect both the pupil and the teacher
against the injustices of unreliable measurement is shown very clearly
in Fig. 3, page 35, and is discussed in detail on pages 34 and 35.

Specific Hints on Securing Reliability in an Examination. The
following suggestions have been found useful in securing high
reliability in objective informal tests:

1. Include from 50 to 100 items, each item being selected from
definite units covering the entire area of the unit of the course. This
step is closely related to securing high validity, but is considered here
from the standpoint of reliability alone.

2. Make the questions objective in type. This eliminates the
variable factor of the teacher’s subjective judgment and gives assurance
that all responses will be rated on the same basis.

3. Eliminate the dead weight from the test. Do not include items
which are so easy that over 80 per cent of the class answer them
correctly. Do not include items which are so difficult that less than
20 per cent of the class give the correct response. It is probable that
test items which are missed by 80 per cent or more, or are missed by
only 20 per cent or less of the class, do not differentiate pupil
accomplishment adequately. These items can be determined by short tests
during the term before they are put into the final test, or they can be
eliminated after the test has been used once.

4. Control the conditions for giving the test. Define specific direc¬ 
tions and conditions for administering the test. 

5. Provide a key with the correct responses. It may be necessary
to modify or give alternate answers on some completion exercises.
The key, like other phases of the test, can be refined best after the test
has been given.
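
The "dead weight" rule in hint 3 above lends itself to a small sketch. The function and variable names and the sample percentages are hypothetical; the 20 and 80 per cent limits come from the text, and items falling exactly on a limit are kept here, which is one possible reading of the rule.

```python
from typing import Dict, List, Tuple


def split_dead_weight(
    percent_correct: Dict[int, float], low: float = 20.0, high: float = 80.0
) -> Tuple[List[int], List[int]]:
    """Separate items to keep from items to drop.

    An item is dropped when more than `high` per cent or fewer than `low`
    per cent of the class answer it correctly.
    """
    keep: List[int] = []
    drop: List[int] = []
    for item, pct in sorted(percent_correct.items()):
        if low <= pct <= high:
            keep.append(item)
        else:
            drop.append(item)
    return keep, drop


# Hypothetical per cent of the class answering each item correctly.
tryout = {1: 92.0, 2: 80.0, 3: 55.0, 4: 20.0, 5: 12.0}
kept, dropped = split_dead_weight(tryout)
print(kept, dropped)
```

The percentages could come either from short tests during the term or from the first use of the full test, as the hint suggests.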

82. Sample Objective Tests on Information Aspects of Elementary 
Woodwork. 

A sample of the results of following through the steps in the
construction of an objective examination as outlined in this chapter is
shown in this section. The specimen is an experimental form of an
objective test in woodworking which has been prepared and used by
one of the authors in connection with his shop work. This test is in
four parts requiring a total testing time of 42 minutes and having a
possible total score of 94 points. Part I consists of 39 true-false items;
Part II of four exercises in procedure-arrangement with a total point
score of 22 points; Part III of 24 completion exercises; and Part IV
of 9 multiple-response items. The reliability of this test based on 100
cases is .84. The total time requirements and the total possible score
for each part are given in Table 26.


TABLE 26

Content of Objective Examination

Part     Type of Exercise     Time Allowance (minutes)     Possible Total Score

I        T-F                  15                           39
II       Pro.-Arr.             8                           22
III      Compl.               12                           24
IV       M-R                   7                            9

Total                         42                           94
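
The part entries of Table 26 (taken here as 15, 8, 12, and 7 minutes and 39, 22, 24, and 9 points) can be added to confirm the totals of 42 minutes and 94 points stated in the text. The variable names below are illustrative only.

```python
# (part, type of exercise, minutes allowed, possible score), as read from Table 26
TABLE_26 = [
    ("I", "True-false", 15, 39),
    ("II", "Procedure-arrangement", 8, 22),
    ("III", "Completion", 12, 24),
    ("IV", "Multiple-response", 7, 9),
]

total_minutes = sum(row[2] for row in TABLE_26)
total_score = sum(row[3] for row in TABLE_26)
print(total_minutes, total_score)
```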
Directions to Pupil

This test is divided into four parts. Specific directions are given for each
part. The amount of time allowed is indicated below the directions. You are
to stop working on each part when the teacher calls time. Do not begin the next
part until the teacher reads the directions and gives the signal to begin work.

Do not waste time on a question you know nothing about; skip it, and go
on to the next one. Do not guess.

You are now ready to study the directions for Part I. You are allowed 15
minutes to complete Part I. Do not ask questions about the test after you begin
work. If you break your pencil or need an eraser ask the teacher for one.

Part I. True-False

The following statements are to be answered by drawing a circle around the
capital letter T or F which follows the statement. The letter T stands for true,
and the letter F for false, or untrue.

If the statement is true, draw a circle around the T; if false, draw a circle
around the F. The sample exercises are answered correctly.


Sample: Nails are made of wood.   T  (F)

        Nails are made of metal.  (T)  F

You are allowed 15 minutes for Part I. Wait for the signal!


Exercises

1. The surface of wood is planed to make it smooth.  T  F

2. Before putting away tools always wipe off any moisture that may
be on them.  T  F

3. The marking gauge is used to make a line perpendicular to an
edge.  T  F

4. Sandpapering on edges and surfaces is done in the direction of the
grain.  T  F

5. Sandpapering is done to get a smooth surface suitable for finishing.  T  F

6. Sandpaper should be wrapped around a block when sanding flat
surfaces.  T  F

7. Wood is stained in order to improve its appearance.  T  F

8. Stain may be applied with a cloth or a brush.  T  F

9. A well-made glue joint is weaker than any other part of the wood.  T  F

10. To obtain a good surface for finishing, sandpaper across the grain.  T  F

11. No. 0 sandpaper is coarser than No. 00.  T  F

12. Varnish is thinned with alcohol.  T  F

13. A first-class job of finishing can be had with only one coat of
varnish if it is put on thick enough.  T  F

14. Varnish can be smoothed down with fine steel wool.  T  F

15. In rubbing down varnish with powdered pumice stone, the pumice
should be rubbed on dry.  T  F

16. A drawing board is made of oak, or some other hard wood.  T  F

17. The visible lines of an object are shown on a drawing by solid
black lines.  T  F

18. The tee-square is used for making horizontal lines.  T  F

19. Thumbtacks should be hammered into place on the drawing board.  T  F

20. Templates are used to make, or mark out, a shape on a board.  T  F

21. The mortise of the mortise-and-tenon joint is the rectangular hole
into which the tenon fits.  T  F

22. Good paint preserves the wood.  T  F

23. Paint and varnish may be applied satisfactorily on damp surfaces.  T  F

24. New wood soaks up much linseed oil.  T  F

25. New wood should have a priming coat applied before the finish
paint is put on.  T  F

26. Paint and varnish are made from the same materials.  T  F

27. It is not necessary to brush paint out well when applying it.  T  F

28. If shellac is too thick, thin it out with turpentine.  T  F

29. Shellac is a slow-drying finish.  T  F

30. Shellac makes a waterproof finish.  T  F

31. It is easier to clean out a brush after varnishing with it if it is
allowed to dry for 24 hours.  T  F

32. After paste wood filler has been applied, the surface must be rubbed
across the grain.  T  F

33. Hand screws should be adjusted before any glue is applied to the
pieces to be glued.  T  F

34. Hand screws hold best if the jaws are parallel to each other.  T  F

35. It is easier to drive in a screw if the screw-driver has a round tip,
than if it has a square one.  T  F

36. Auger bits are graduated or numbered in thirty-seconds of an inch.  T  F

37. To bore a nice clean hole with an auger bit, bore through until the
spur shows, and then finish boring from the other side.  T  F

38. A tee-bevel square is used when one wants to lay out an angle.  T  F

39. More accurate work can be done if knife lines are used, rather than
pencil lines.  T  F

Stop and wait for the directions for Part II!


Part II. Procedure-Arrangement

On this page are several jobs, and the steps necessary for doing the job.
However, the steps are not placed in the correct order.

Decide which step should be done first, and place the number of the step in
the first parenthesis, then the number of the second step in the second
parenthesis, and so on until all the steps are down. The sample exercise is answered
correctly.


Sample: To apply stain.

1. Let stand for 2-3 minutes.

2. Apply stain.

3. Smooth the surface with sandpaper.

4. Wipe off excess stain with cloth.

5. Select a suitable stain.

(3) ... (5) ... (2) ... (1) ... (4) correct order.

You are allowed 8 minutes for Part II. Wait for the signal!

Exercises

1. List the following grades of sandpaper according to coarseness, placing the
finest grade first.

1. No. 0.

2. No. 1/2.

3. No. 2.

4. No. 1 1/2.

5. No. 00.

6. No. 1.

( ) ... ( ) ... ( ) ... ( ) ... ( ) ... ( )

2. To square up a board.

1. Plane a surface true, mark it No. 1.

2. Plane one edge square with No. 1, mark it No. 2.

3. Gauge and plane to thickness, square with edges and ends, mark No. 6.

4. Cut to width and square other side with No. 1 and 3, and mark No. 5.

5. Plane one end, and square with No. 1 and 2; mark No. 3.

6. Cut to length, square other end with No. 1 and 2; mark No. 4.

( ) ... ( ) ... ( ) ... ( ) ... ( ) ... ( )

3. Apply paint on new wood.

1. Apply first coat of finish paint.

2. Shellac the knots.

3. Apply second coat of finish paint.

4. Clean off any grease or dirt with cloth wet in benzine.

5. Apply coat of priming paint.

( ) ... ( ) ... ( ) ... ( ) ... ( )

4. To bore a hole with a brace and bit.

1. Fasten bit in brace.

2. Withdraw bit and finish boring from opposite side.

3. Mark location of hole.

4. Bore through until spur shows on other side.

5. Select proper size bit.

( ) ... ( ) ... ( ) ... ( ) ... ( )

Stop and wait for the directions for Part III!

Part III. Completion Exercises

Each of the following statements has one or two words left out. When the
correct word or words are inserted in the blanks left for them, the sentences
are specific and complete. The sample exercises are answered correctly.


Samples:

Nails are driven with a hammer.

Screws are driven with a ______. Here screw-driver is the correct word.

Think what word completes the sentence and write it in the blank space left
for it.

You are allowed 12 minutes for Part III.
Wait for the signal!
Exercises

1. A fine ______ is used for whetting a plane blade.

2. Sharpening chisels and plane irons on an oilstone removes the ______.

3. The thickness of a shaving is regulated by the ______.

4. The ______ holds the plane blade in place.

5. When plane is not in use lay it on its ______.

6. In starting a shaving cut with a plane, press ______ upon the knob of the
plane.

7. In driving nails, at first use ______ steady strokes.

8. Inserting a wood block under the hammer when pulling nails prevents
______ the wood.

9. The ______ should be used to guide the saw when starting a cut.

10. Saws work easier when rubbed with ______ occasionally.

11. An auger bit is inserted into the ______ of the brace.

12. The number on the tang or shank of a bit indicates its ______.

13. A ______ is used when boring a number of holes the same depth.

14. The teeth of a coping saw blade should point ______ handle.

15. The cross-cut saw is used for cutting ______ the grain.

16. The tool used for setting nails below the surface is called a ______.

17. The cutting action of a ripsaw is like that of a number of ______.

18. The cutting action of a cross-cut saw is like a number of ______.

19. The ripsaw is used for cutting ______ the grain.

20. A ______ cornered file is used to sharpen saws.

21. Damp spongy wood requires a saw with plenty of ______.

22. The ______ works as a crank and holds the bit when boring.

23. The size of an auger bit in ______ of an inch may be found on the tang
or shank.

24. The cut made by a saw is called a ______.

Stop and wait for the directions for Part IV!


Part IV. Multiple-Choice Exercises

Each of the statements below is answered correctly by one of the words
following the sentence.

Determine which of the choice of words correctly answers the statement, and
write the number of that word on the line at the end of the exercise. The sample
exercise is answered correctly.


Sample: Sandpaper is made up of paper, glue, and ______

(1) brick dust, (2) sand, (3) emery, (4) gravel.  2

Sand is the correct answer, and the number of the word (2) is written on the
line.

You are allowed 7 minutes for Part IV.

Wait for the signal!

Exercises


1. Shellac is thinned with ______

(1) benzine, (2) turpentine, (3) water, (4) alcohol.  ____

2. Oil stain is mixed with ______

(1) alcohol, (2) turpentine, (3) water, (4) linseed oil.  ____

3. Red cedar is the best wood for ______

(1) dressers, (2) lamps, (3) chests, (4) bookracks.  ____

4. Joinery is used in ______

(1) plastering, (2) plumbing, (3) cabinet making, (4) bricklaying.  ____

5. A good liquid to rub on tools to prevent rusting is ______

(1) kerosene, (2) water, (3) machine oil, (4) turpentine.  ____

6. A working drawing of an object shows the ______

(1) corner view, (2) top view, (3) rear view, (4) bottom view.  ____

7. Brushes that have been used in shellac should be cleaned in ______

(1) oil, (2) turpentine, (3) alcohol, (4) gasoline.  ____

8. Varnish brushes should be cleaned in ______

(1) linseed oil, (2) shellac, (3) turpentine, (4) water.  ____

9. Glued joints are commonly strengthened with ______

(1) dowels, (2) rivets, (3) wire, (4) brads.  ____

End of the test 

The foregoing objective examination is designed to function as a
paper-and-pencil test for measuring informational aspects of
instruction in woodworking. Tests of this type can be developed by the
industrial education teacher who will follow the principles outlined in
this volume. The objectives of the course of study must be definitely
identified. The rest of the process is largely the mechanical
formulation of the selected items in suitable objective form. Such tests of
information in industrial education are valuable as partial measures
of achievement and teaching success, but alone, they are insufficient.
They should be supplemented by performance tests.


II. CONSTRUCTION OF OBJECTIVE PERFORMANCE TESTS 

The testing of performance is not new to industrial arts and
industrial education. Manipulative trade tests were devised during
the war and have been used with varying degrees of satisfaction
in industry. The reliability of many of the early manipulative
tests was low, and efforts to measure manipulative skill have
not been as successful as the measurement of information by the use
of the objective pencil-and-paper tests. A part of this difficulty has
arisen from trying to apply pencil-and-paper techniques of test
construction to manipulative-test construction without the necessary
modifications in the administrative procedure.

83. Steps in Preparing Performance Tests.

In evaluating the available performance tests and in constructing
performance tests in the general shop the authors have found the
following steps worthy of careful consideration:

1. Analyze the course of study to determine exactly what qualities
may be tested.

2. Decide what tools and materials will be necessary.

3. Prepare a number of test exercises or make a composite exercise
that will offer the pupil an opportunity to provide an adequate
sample of his work with each tool or instrument and type of material
which it is desired to test.

4. Make a statement of procedure which tells the pupil exactly
what to do in a vocabulary which is comprehensible at his grade
level.

5. Prepare a set of general directions for the pupil before the test
is administered.

6. Prepare directions for the examiner.

7. Devise a method of scoring the test which provides an adequate
measure of the results of each tool or instrument.

8. Try out the test on a few students, and make the more obvious
corrections.

9. Make two or more forms of the test.

10. Try out the test, and compute the reliability coefficient,
standard deviation, probable error, etc.

For the purpose of illustrating the application of the principles of
performance-test construction let us consider the measurement of the
results of the following tool operations from a beginning woodworking
course.

1. Planing: side, end grain.

2. Sawing: ripping, cross-cutting to a line.

3. Boring: perpendicular to a surface.

4. Squaring: a line around a block.

5. Measuring: to 1/8 inch with try-square and rule.

6. Gauging: a line parallel to a surface.

It should be kept clearly in mind that this is not a test of technique
or speed but a test of quality or accuracy. The question is, how
accurately can a pupil modify materials with these tools regardless of the
method of handling them or the time required.

The following tools are required: jack plane, try-square 6 inches,
pencil, 24-inch folding rule, back saw, ripsaw, brace, 1/4-inch bit,
bench with vise, bench hook, and marking gauge. The tools must be
in first-class condition.

The following materials are required: first-quality white pine free
of knots, 1 inch thick, surfaced on two sides, and ends sawed at an
angle. It is most desirable to have the materials used in a test of this
type of uniform quality and the pieces of stock used by the pupils of
about the same size and shape.

A composite form of test exercise was selected for this test.
Fig. 16 shows the working drawing for Form A, and Fig. 17 shows the
working drawing for Form B. It will be observed that the main
differences between the test exercises are in the dimensions; otherwise,
there is the same opportunity for modifying wood with common tools.




Fig. 17. Working drawing for Form “B”


The validity of a performance test depends on providing samples of
the pupil’s work and enough of the sample to give an adequate
measure of accuracy. Composite performance exercises are easier for the
pupil to visualize if they represent some familiar object or toy, but
it is seldom possible to do this without throwing the sampling of the
various tool operations out of balance.


Woodworking Performance Test of Accuracy
Forms A and B

Directions to Examiner: It is essential that the pupil shall understand the
exact procedure, and that he be able to visualize how the block is to look when
finished. The following directions are recommended:

1. Read aloud and distinctly the directions to the pupil while the class follows
silently. Answer any questions about the directions at this point.

2. Show the pupils a completed test block, and if they care to, let them
examine it.

3. When there are no further questions, say, “Get ready. Hold up the test
block. Begin work.”

4. During the examination answer any questions about the steps in the
procedure by rereading the step in question with the pupil.

5. Observe the pupils as they work to make certain that they are doing all the
steps in the correct order.

6. Make certain that the proper tool is used where indicated, but do not
tell the pupil how to use the tool.

7. Help any pupil having difficulty in interpreting the working drawing, but
do not make any measurements on the test block for the pupil. This test
measures ability to measure to 1/16 in. with a ruler, but is not a measure of
ability to read drawings.

8. Take in the test block when the pupil has finished. The time is not
important, for this is a test of quality or accuracy as it applies to modifying wood
with simple hand tools.

Directions to Pupil: This is a test to determine how accurately you can use
woodworking tools. The wood and all necessary tools will be given to you.
The surfaces of the block of wood are numbered 1, 2, 3, 4, 5, 6. You will be
given specific directions for doing the job and a working drawing that gives
all the necessary dimensions. Do this project as accurately as you can. Do
not waste time, but do not work too fast to do your best work. The steps
must be done in the order given. After you begin work do not ask unnecessary
questions, but if you are in doubt about a step in the procedure or a dimension
on the working drawing ask the examiner. Write your name and grade in school
on surface No. 6 of the test block. Do not begin work until the examiner gives
the signal.


Procedure ³


Woodworking Performance Test, Form A
Part I


1. Select face No. 1. Plane it square and true and to the thickness indicated
on the working drawing (Fig. 16). When finished re-mark No. 1.

2. Select side No. 2. Plane it square and true to surface No. 1. When finished
re-mark No. 2.

3. Select end No. 3. Plane it square and true to No. 1 and 2. When finished
re-mark No. 3.

4. Measure from end No. 3 toward end No. 5, and square a sharp pencil line
across the block to the length indicated in the working drawing. Saw off the
waste material with a back saw so that the stock will be as nearly the required
length as you can make it. Do not plane. Re-mark end No. 5.

5. From edge No. 2 gauge a line the length of the block, allowing the exact
width as indicated on the working drawing. Rip as near the exact width as
possible. Do not plane.

6. On surface No. 1 lay out the center for the hole and bore.

7. When you have finished take your block to the examiner.

84. Scoring Performance Tests. 

A try-square, a 1/4-inch dowel 3 inches long, and a scale measuring
in sixty-fourths are the tools used in scoring. The test is scored in
units of sixty-fourths of an inch. If a measurement is more than 9/64
off, it is given a score of zero. If it is 1/64 off, it is given 9 points;
5/64, 5 points, etc., as shown in Table 27.
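
The scoring rule of Table 27 can be sketched in a few lines of Python. This is only an illustration, assuming the error has already been read off the scale as a whole number of sixty-fourths; the function name `point_score` is hypothetical and not part of the test.

```python
def point_score(sixty_fourths_off: int) -> int:
    """Point score for a measurement that is off by the given number of
    sixty-fourths of an inch (0/64 earns 10 points, 9/64 earns 1 point,
    anything beyond 9/64 earns zero)."""
    if sixty_fourths_off < 0:
        raise ValueError("the error is taken as a magnitude, so it cannot be negative")
    return max(0, 10 - sixty_fourths_off)


if __name__ == "__main__":
    for error in (0, 1, 5, 9, 10):
        print(f"{error}/64 off -> {point_score(error)} points")
```

Each dimension on the test block would be scored this way and the point scores added.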


Procedures for Forms A and B are identical although details are different.

This test has been used a number of times by the authors in shop
measurement. The correlation of Form A with Form B on 30 cases
shows a reliability coefficient as high as .90. As has been pointed out,
this test is not concerned with speed and technique, but only with the
accuracy or quality of work. It is not difficult to construct valid and
reliable manipulative tests if the test worker keeps clearly in mind the
different measurable factors in industrial education. However, it is
well to remember that it requires skill to administer a manipulative
test. The examiner must follow the technique carefully. A fine tool
in the hands of the unskilled worker will not necessarily produce an
acceptable result.

SUMMARY

The outline of the curricular content of the unit affords the basis for
validating the content of the test. With these objectives and the details
of the informational content of the course as a background, a
paper-and-pencil test is easily formulated by using the types of testing
techniques discussed in Chapter X.

Knowledge of the informational aspects of courses in industrial 
education is only one of the important outcomes. Actual production 
in the shop or laboratory is quite another important outcome, which, 
within certain limits, is measurable. Quality in shop work may be 
rated by inspection, by actual measurement of dimensions, and by 
judgments based on quality scales. Inspection, if based on personal 
judgments unsupported by objective criteria of quality, is highly
inaccurate. Actual measurement of the physical qualities (dimensions,
etc.) is probably the most objective. However, there are certain other
qualities in shop products which are not mere matters of dimensions
or accuracy of measurement or tool work. The evaluation of such
qualities requires the use of a rating scale. Such quality scales are
not merely useful for the measurement of accomplishment, but they 
are particularly helpful in the development of an appreciation of 
quality on the part of the pupil. 

TABLE 27

Limits of Tolerance    Point Scores
       0/64                 10
       1/64                  9
       2/64                  8
       3/64                  7
       4/64                  6
       5/64                  5
       6/64                  4
       7/64                  3
       8/64                  2
       9/64                  1

SUMMARY EXERCISES FOR DISCUSSION

1. List and illustrate the constructive steps in validating an objective
information test in an industrial subject.

2. Following the general plan presented in the section in this chapter
entitled Course of Study Outline in Woodwork, prepare a similar list of
specific objectives for some other phase of industrial education.

3. What new suggestions can you add to the list of specific hints on
securing reliability in an examination?

4. Prepare at least ten objective test items of each of the four types
illustrated on pages 140 to 144 of this chapter, using any industrial
education subject-matter except woodworking.

5. Prepare a performance test in metal working, printing, or mechanical
drawing following the steps outlined in this chapter.

CHAPTER XII


CONSTRUCTION AND USE OF SCALES FOR RATING 
INDUSTRIAL EDUCATION PROJECTS 

I. PROJECT RATING SCALES

85. Need for Scales for Rating Industrial Education Projects.

The rating of shop projects and drawings is a difficult measurement
problem which is faced by practically all industrial education teachers.
The assignment of an objective rating to a shop product such as a 
chair, a radio, a table, a lamp, a funnel, a dustpan, a cement bird- 
bath, or an isometric drawing calls for the keenest of discrimination 
and for some method of objectifying standards of quality. Such 
products of the shop are composed of many different parts, which, 
combined, reflect quality in the object. For example, in the funnel, 
there are such factors as the forming, turning, wiring, seaming, and 
soldering. All these operations must be well done if the funnel is to 
have quality. Moreover, it must be made of the proper size and of 
suitable material. Thus size, shape, quality of material, suitability 
of material, and quality of workmanship as revealed in many small 
details must be recognized and evaluated by the judge. Each addi¬ 
tional characteristic sets up a number of possible variations in quality. 

The difficulty of grading or rating such products appears to be
related to the degree to which they vary from the typical. For example,
shop and drawing projects which are well executed and those which
are very inaccurate present a simpler marking problem than those
which are made up of mixtures of good, bad, and indifferent qualities.
This may be made clear by a concrete illustration. Let us assume 
that a number of blocks or dominoes are to be arranged in a straight 
line one inch apart with all faces parallel. If all the units are arranged 
correctly it is not difficult to see that this is true. In a like manner, 
if a shop project is very well done, it tends to radiate perfection. 
Likewise, a very poor project is not difficult to distinguish. However, 
the most difficult problem of measurement presents itself when part of 
the dominoes are set up correctly, a few of them are down, and all 
the rest are in varying degrees of alignment. The shop project which 
shows excellent design, a poor finish, square edges, weak joints, rough
surfaces, and carefully rounded corners is likewise more difficult to
judge. If the design is given undue consideration, the project will be
rated too high. If only the joints and surfaces are considered, the
pupil may fail on the project.

86. Constructing a Project Rating Scale. 

The chief problem confronting shop and drawing teachers in their 
rating of projects is the definition of objective standards of quality. 
Industrial arts teachers have recognized the need for better methods of 
rating shop projects. Many realize that the rating of shop and drawing
problems is so subjective and unreliable that it is difficult to assign
the proper rating to a pupil’s project. In general, the suggestions for
improvement in the reliability of rating shop projects indicate the
desirability of combining the judgments on the different parts of the
projects into a complete rating, and having the projects rated by three
or more qualified judges.

The authors have found the following principles helpful in
constructing a rating scale for shop and drawing projects:

1. Make a careful analysis of the course of study for the purpose of
selecting the factors to put in the rating scale. In general, the items
rated are the changes made in materials by the use of tools or
instruments, fasteners, and finishes.

Example (from eighth-grade general woodworking unit): Utility, design,
proportion, nailing, squareness, dimensions, screw joints, glued joints,
boring, sawed edges, planed edges, sanding, and finish.

Example (from ninth-grade drawing): Neatness, dimensions, arrowheads,
lines, accuracy, French curves, placement, joints, lettering, and circles.

2. Group the factors into classes according to method of rating to 
be used. 

Example: Woodwork.

    Inspection.
        Utility.
        Design.
        Proportion.
        Finish.

    Physical measurement.
        Squareness.
        Dimensions.

    Rating scale or inspection.
        Nailing.
        Screw joints.
        Glue joints.
        Boring.
        Sawed edges.
        Planed edges.
        Sanding.
        Wood filing.

Example: Drawing.

    Inspection.
        Neatness.
        Placement.
        Arrowheads.

    Physical measurement.
        Circle.
        Accuracy.
        Dimensions.

    Rating scale or inspection.
        Lettering.
        Lines.
        Numbering.


3. Put the factors into a rating scale so that each part of the project
can be given an individual objective rating and the ratings combined.

Inspection by a critical and observant judge may reveal many defects
in quality. Checks on the physical measurements of products by means
of the rule and the square are very objective. Pieces of material can
be tested for squareness, thickness, width, and length. There are,
however, numerous quality factors which are less tangible and are
rated better with quality scales especially developed for the purpose.
Splicing, lettering, soldering, and sawing are examples of such
qualities.

4. Prepare a set of directions for using the project rating scale. 

The project rating scale needs a carefully prepared set of directions
which explain in simple, direct language just how the scale is used.
This should include the method of recording the ratings on the
individual parts of the rating scale, the tools, and the quality scales
needed, as well as a statement of the method of arriving at the
complete score. A place should also be provided for recording the
necessary information about the pupil and the project, as, for example,
the student’s name, grade in school, date of finishing project, name of
project, name of judge, and total score.

5. Prepare a key for transforming the distance ratings into objective
values for use in computing the composite rating. This step is not
necessary when the scale units are numbered on the scale.

87. Typical Project Rating Scales.

On the following pages are excerpts from a project rating scale for
mechanical drawing which the authors have found helpful in their
classes and which may serve to illustrate the application of the
principles discussed in this chapter.¹

Rating Scale for Mechanical Drawings

Pupil’s name ____________________

Drawing ____________________

School ____________________ Date ____________________

Name of judge ____________________ Score ____________________

Directions: The information required to rate the items is obtained by
inspection, quality scales, and physical measurement. Each item is rated
on the basis of 10 points. The total rating of the drawing is the sum of
all the item ratings which apply to the drawing.

Mark the items in the order in which they appear.

Draw a circle around the figure on the scale to indicate your rating of
the item. The scale is divided into ten (10) equal parts. The right side
(10) indicates the highest mark; the center, average; the left, the
lowest. A profile of the ratings may be made if desired by connecting the
numbers representing the ratings assigned to each item which applies to
the specific project.

Example: If you think Utility under I. The Drawing should get a mark of
8, then draw a circle around the figure 8, thus:

I. The Drawing.

    1. Utility                     1 2 3 4 5 6 7 (8) 9 10

Instruments: Architect’s scale, compass.

Quality Scales: The judges should have quality scales for lettering,
figures, and lines, each with ten or more samples of known value.

I. The layout.

    1. The placing on paper            1 2 3 4 5 6 7 8 9 10
       Is the object placed in such a manner as to permit a clear-cut
       drawing?

    2. Geometric construction          1 2 3 4 5 6 7 8 9 10
       How good is the instrument work?

    3. Trueness in meeting lines       1 2 3 4 5 6 7 8 9 10
       Do the lines meet truly and completely?

    4. Construction lines.
       a. Accuracy                     1 2 3 4 5 6 7 8 9 10
          By how many thirty-seconds do the lines miss?
       b. Quality                      1 2 3 4 5 6 7 8 9 10
          Are the lines clean and sharp cut?

¹ The latter part of this chapter is devoted to a discussion of a reliable
method of constructing quality scales.

² Inspection must be used when quality scales are not available.

II. The finished drawing (pencil or ink).

    1. Circles.
       a. Neatness                     1 2 3 4 5 6 7 8 9 10
          Are all lines clean-cut and uniform?
       b. Trueness                     1 2 3 4 5 6 7 8 9 10
          Are the circles truly drawn?
       c. Accuracy                     1 2 3 4 5 6 7 8 9 10
          Does the diameter vary from the true dimension more than
          1/32 inch?
       d. Tangency                     1 2 3 4 5 6 7 8 9 10
          Do tangent lines fit in perfectly?

    2. Arcs and curves.
       a. Neatness                     1 2 3 4 5 6 7 8 9 10
          Are the lines clean-cut and uniform?
       b. Trueness                     1 2 3 4 5 6 7 8 9 10
          Are the curves and arcs truly drawn?
       c. Accuracy                     1 2 3 4 5 6 7 8 9 10
          Do lines vary more than 1/32 inch from given dimension?
       d. Tangency                     1 2 3 4 5 6 7 8 9 10
          Do tangent lines fit in perfectly?
       e. Completeness                 1 2 3 4 5 6 7 8 9 10
          Do the lines run up completely?

    3. Horizontal lines.
       a. Neatness                     1 2 3 4 5 6 7 8 9 10
          Are the lines clean-cut and uniform?
       b. Accuracy                     1 2 3 4 5 6 7 8 9 10
          Do lines vary more than 1/32 inch from given dimension?
       c. Trueness                     1 2 3 4 5 6 7 8 9 10
          Are all lines straight and horizontal?
       d. Completeness                 1 2 3 4 5 6 7 8 9 10
          Do the lines run short?

    4. Vertical lines.
       a. Neatness                     1 2 3 4 5 6 7 8 9 10
          Are all lines clean-cut and uniform?
       b. Accuracy                     1 2 3 4 5 6 7 8 9 10
          Do lines vary more than 1/32 inch from given dimension?
       c. Trueness                     1 2 3 4 5 6 7 8 9 10
          Are all lines straight and vertical?
       d. Completeness                 1 2 3 4 5 6 7 8 9 10
          Do lines run up completely?

    5. Skew lines.
       a. Neatness                     1 2 3 4 5 6 7 8 9 10
          Are the lines clean-cut and uniform?
       b. Accuracy                     1 2 3 4 5 6 7 8 9 10
          Do the dimensions vary more than 1/32 inch from given
          dimensions?
       c. Trueness                     1 2 3 4 5 6 7 8 9 10
          Are the lines straight and to the points?
       d. Completeness                 1 2 3 4 5 6 7 8 9 10
          Do lines run up completely?









    6. Dimension lines.
       a. Placing                      1 2 3 4 5 6 7 8 9 10
          Are they placed correctly and conspicuously?
       b. Quantity                     1 2 3 4 5 6 7 8 9 10
          Are the necessary lines in?
       c. Quality                      1 2 3 4 5 6 7 8 9 10
          Are they the true lines as to accuracy?
       d. Spacing                      1 2 3 4 5 6 7 8 9 10
          Are they spaced too close or too much?
       e. Correctness                  1 2 3 4 5 6 7 8 9 10
          Are they correctly put in?
       f. Arrowheads.
          (1) Quality                  1 2 3 4 5 6 7 8 9 10
              Are they neat and trim?
          (2) Accuracy                 1 2 3 4 5 6 7 8 9 10
              Do they come up to the extension lines?
       g. Extension lines              1 2 3 4 5 6 7 8 9 10
          Do they run into the object?



    7. Dimensions.
       a. Legibility                   1 2 3 4 5 6 7 8 9 10
          Can they be read?
       b. Correctness                  1 2 3 4 5 6 7 8 9 10
          Are they correctly put in?

    8. Notes.
       a. Legibility                   1 2 3 4 5 6 7 8 9 10
          Can they be read?
       b. Clearness                    1 2 3 4 5 6 7 8 9 10
          Do they state exactly what is wanted?
       c. Lettering.
          (1) Uniformity               1 2 3 4 5 6 7 8 9 10
              Are the letters uniform as to height and slant?
          (2) Appearance               1 2 3 4 5 6 7 8 9 10
              How are the letters formed?
          (3) Firmness                 1 2 3 4 5 6 7 8 9 10
              Are the strokes firm?

III. Summing up the drawing.

    1. Utility                         1 2 3 4 5 6 7 8 9 10
       Can the drawing be used?

    2. Value                           1 2 3 4 5 6 7 8 9 10
       Is the drawing of any aid in making the project drawn?

    3. Appearance                      1 2 3 4 5 6 7 8 9 10
       Is the drawing executed in a neat, professional manner?

    4. Completeness                    1 2 3 4 5 6 7 8 9 10
       Is the drawing complete?

88. Using Project Rating Scales.

Not only does the project rating scale afford a more objective means
of rating projects from the shop and drawing-room, but it is itself a
valuable teaching device. It aids the pupil in developing a proper
appreciation of quality in workmanship. Its use by students in the
rating of their own projects and those of their classmates gives them
valuable opportunities for developing habits of careful analysis and
experience in judging quality in workmanship.

The best teaching results are obtained with the project rating scale
when it is used during the time the project is being made. The items
in the project scale should be arranged in an order which facilitates
the use of the project rating scale parallel with the development of the
project itself. The finish, utility, design, and proportion of a project
are better judged after the project is completed. The results of
sawing, gluing, squaring, screwing, turning, forming, soldering, riveting,
splicing, etc., are easier to judge just after they are completed and
before they have been obscured or modified by other tool operations
or parts of the project. Sawed or planed edges are often modified by
filing, scraping, and sanding. The quality of these operations must be
rated at a time when they give a true picture of the pupil’s proficiency.

The general quality of a project is determined by the sum total
of the operations which go into its making. It is well for a pupil to
realize this, and to use the diagnostic value of the rating scales to
check the results of the tool operations. If a pupil is unable to set a
rivet, make a splice, or bore a hole in an acceptable manner, both the
pupil and the instructor should be aware of the fact. The pupil may
need remedial work, and even though he may never be able to achieve
a high standard of workmanship he can and should develop an
appreciation of what constitutes quality in workmanship. One generally
accepted method of developing an appreciation of quality is to allow
the pupil to modify materials and to judge the results of his own
efforts and those of others in relation to acceptable standards.

Teachers of industrial education need training and practice in the
construction and use of project rating scales. This is a function that
should be taken over by the teacher-training institutions and
supervisory officers. In the event that the proper training is not
available, the progressive industrial education teacher should make
project rating scales which are based on his course of study and
practice their use. In any case a carefully prepared project rating
scale reduces the variation of marks and provides a valuable teaching
device which cannot be ignored by those who wish to do superior work
in their shops and drafting-rooms. A carefully prepared project rating
scale is more objective than the usual mark assigned by the
teacher, but it is distinctly more meaningful when the ratings of from
five to ten judges are pooled. When the pooled judgments of such
a number of trained observers are used, reliability coefficients of .90
or better are obtained. This is a satisfactory reliability of
measurement and produces a rating comparable to the best objective
ratings in other fields of instruction.
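
The arithmetic behind a total rating and a pooled rating can be sketched as follows. This is a minimal illustration, not the authors' procedure: the item names are taken from part III of the drawing scale, the ratings are invented, the function names are hypothetical, and pooling by the arithmetic mean is an assumption, since the text does not say how the judges' ratings are combined.

```python
from statistics import mean


def total_rating(item_ratings):
    """Sum the ratings of the items that apply to the project.

    Each rating is on the 10-point basis used by the drawing scale;
    None marks an item that does not apply and is left out of the total.
    """
    total = 0
    for item, rating in item_ratings.items():
        if rating is None:
            continue
        if not 1 <= rating <= 10:
            raise ValueError(f"{item}: rating must be between 1 and 10")
        total += rating
    return total


def pooled_rating(ratings_by_judge):
    """Pool several judges' totals (assumption: simple arithmetic mean)."""
    return mean([total_rating(ratings) for ratings in ratings_by_judge])


if __name__ == "__main__":
    # Hypothetical ratings of one drawing by three judges.
    judges = [
        {"Utility": 8, "Value": 7, "Appearance": 9, "Completeness": 8},
        {"Utility": 7, "Value": 8, "Appearance": 8, "Completeness": 7},
        {"Utility": 9, "Value": 7, "Appearance": 8, "Completeness": 7},
    ]
    print([total_rating(j) for j in judges], pooled_rating(judges))
```

For the totals to be comparable, every judge should rate the same set of applicable items.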

II. QUALITY SCALES FOR SHOP PRODUCTS

89. Constructing Quality Scales.

Quality scales are useful measurement and teaching instruments
which the shop teacher can easily develop and use in his course. Such
scales are a very useful means of evaluating certain types of work
done in the class. If made available for inspection and comparison,
they serve to set up standards for the students themselves to
attain, thus aiding the pupil in developing an appreciation of quality.
The pupils may rate their own and other pupils’ work, thus gaining
real experience in rating and further developing a concept of what
constitutes real quality in workmanship.

The teacher of woodworking will find that quality scales dealing
with such skills as sawing, boring-exit edge, fastening with screws,
gluing, planing, sanding, and nailing are very helpful instruments to
use in the more exact evaluation of shop products. The specimens
constituting the scales can be mounted on suitable panels to be
conveniently available for inspection and use. The teacher of
sheet-metal work needs quality scales showing the range of workmanship
in riveting, soldered lap-seam work, wiring, locked-seam work, and
turning. This list may be reduced or expanded depending on the type of
course being taught.

Quality scales have been widely used in rating such school products
as writing, lettering, and drawing. Scales of this type can be
reproduced and widely distributed, but quality scales dealing with
pieces of material are not so easily duplicated. Quality scales for shop
use should be made up of actual specimens of the products themselves.
They should be available to the students so that the specimens may be
seen and handled. Such scales may be photographed, but in use the
picture lacks the satisfaction resulting from a scale composed of actual
samples of the work to be rated.

Because of the real value which quality scales have in the
measurement of industrial education and in the development of the
pupil’s concept of quality, a method for developing such scales is
presented in this chapter.

90. Steps in Making a Quality Rating Scale.

The following steps are essential in developing a quality scale for
shop products:

1. Secure samples of the type of product to be used in the scale.

2. Arrange the samples in order of merit as determined by the
judgment of ten or more competent judges.

3. Determine the percentages of judges rating a given sample as
better than each other sample.

4. Arrange the specimens in order from best to poorest on the basis
of these composite judgments.

5. Find the deviation of the percentages from the median (50 per
cent), retaining the sign of the deviation.

6. Calculate the scale differences between the successive samples.

7. Assume a zero point, and place all samples along the linear scale
of values.

8. Select eight to twelve of the samples which most nearly approach
uniform differences in quality for the quality scale, assign the proper
quality values to them, and mount them for use in the shop.

Each of these essential steps in constructing a quality scale is
discussed and illustrated in succeeding sections of this chapter. A
quality scale showing degrees of merit in soldering a lap seam is used
for illustrative purposes. Fig. 19 shows the values assigned in the
completed scale.

91. Securing Samples.

Twenty to forty representative samples ranging widely in quality
will usually be adequate for the construction of a satisfactory quality
rating scale for use in an industrial education course. In securing
samples for such a scale, it is essential that some of the samples be
superior, some poor, and some average in quality. Samples of average
quality are easy to secure, but it may be more difficult to get a proper
number of excellent and poor ones. If twenty to forty samples have
been secured and, after a superficial inspection, it is obvious that there
is a shortage of poor and excellent ones, the instructor should make a
few samples which are excellent and a few which are very poor and
add them to the list. This practice is defensible, as it might be
necessary to sample very widely in order to secure enough specimens
showing the required quality range. The use of many more than forty
samples makes it more difficult for the judges to rate them carefully
and also adds considerably to the amount of calculation necessary in
the derivation of the scale.

Twenty-two samples were used in construction of the scale in Fig.
19. It shows the quality of soldering on a lap seam. No samples
were added to this group. The students who made the samples varied
widely in their ability and background.

92. Securing Independent Estimates of Relative Quality of the 
Samples. 

The samples to be used in making the scale should be lettered or
numbered to make them readily identifiable. Names of pupils who
made them should not be attached, since they might influence the
judges’ rating. After the samples are properly labeled, they should
be given to the judges individually in random order. The judges are
instructed to arrange them from poorest to best by comparison. The
judges may be shop teachers or tradesmen, but they should be
persons competent to rate the quality of the samples. The number of
judges should not be less than nine or ten. A larger number would
be better. If ten (or multiples of ten) judges can be secured, the
calculation of percentages is simplified. It might be permissible for
an isolated teacher who is unable to secure the cooperation of that
number of qualified judges to have the same judge rate the samples
more than once. If this procedure is followed, it is advisable to allow
a day or two between ratings to reduce the influence of memory in
placing the samples.

Table 28 gives the results obtained when nine judges ranked the 
twenty-two soldered lap joints used in making the scale illustrated 
here. Each of the twenty-two samples is designated by a letter. 

TABLE 28

Ranking by Nine Judges

        High                                                           Low
          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
Judge 1   I  P  U  R  B  A  G  N  E  S  L  ?  Q  V  ?  J  F  D  K  M  H  T
Judge 2   B  I  S  R  U  P  G  Q  A  L  E  N  C  J  M  V  K  O  H  D  F  T
Judge 3   P  B  A  N  I  U  S  J  Q  L  V  E  R  G  K  C  O  M  H  F  D  T
Judge 4   B  P  Q  E  ?  R  U  G  C  S  N  A  L  J  H  M  ?  ?  ?  ?  T  F
Judge 5   B  R  P  G  Q  S  U  I  A  J  E  V  L  K  M  O  F  H  N  D  C  T
Judge 6   B  ?  Q  I  R  A  G  E  U  S  V  J  ?  N  O  K  C  M  F  D  H  T
Judge 7   B  I  R  C  A  P  U  G  S  H  J  Q  E  V  L  M  K  F  N  O  D  T
Judge 8   ?  I  Q  ?  U  P  ?  J  C  G  A  N  V  M  O  E  K  F  L  D  H  T
Judge 9   I  J  C  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?  ?

93. Determining Percentage Ratings of Judges.

The ratings of the twenty-two specimens by the nine judges are
tabulated in Table 28. The next step is the determination of the
percentage of judges rating a given sample as better than every other
sample. Table 29 gives these data for the nine judges and twenty-two
samples used in developing the quality scale for soldering lap joints.
It will be noted that the samples are rated in alphabetical order. This
table is to be read as follows: A is better than A no times because they
are the same sample, but B is rated better than A nine times, which
means all the judges considered it better than A, etc. It will be noted
that C is rated better than A four times, better than B zero times,
etc. This table, showing the number of judges rating each sample in
relation to every other sample, is the basis for the next step in
constructing the scale.

The next step is to change the ratings to percentages. Table 30
shows the ratings in Table 29 after they have been changed to
percentages. Specimen B is rated better than A by nine judges, or 100
per cent. Specimen C is rated as better than A by four judges, or
44.4 per cent. This procedure when completed gives the results as
reported in Table 30.
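
The tabulation behind Tables 29 and 30 can be sketched in Python. This is an illustration only: the three rankings of four samples are invented (they are not the nine judges' rankings), and the function names `better_than_counts` and `to_percentages` are hypothetical.

```python
from itertools import permutations


def better_than_counts(rankings):
    """counts[x][y] = number of judges who rated sample x better than y.

    Each ranking lists the samples from best to poorest.
    """
    samples = sorted(rankings[0])
    counts = {x: {y: 0 for y in samples} for x in samples}
    for order in rankings:
        position = {sample: place for place, sample in enumerate(order)}
        for x, y in permutations(samples, 2):
            if position[x] < position[y]:  # x was placed ahead of y
                counts[x][y] += 1
    return counts


def to_percentages(counts, n_judges):
    """Change the counts to percentages of judges, to one decimal place."""
    return {
        x: {y: round(100 * c / n_judges, 1) for y, c in row.items()}
        for x, row in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical rankings by three judges, best sample first.
    rankings = [list("BACD"), list("BCAD"), list("ABCD")]
    counts = better_than_counts(rankings)
    percentages = to_percentages(counts, len(rankings))
    print(counts["B"]["A"], percentages["B"]["A"])
```

With nine judges, a count of four becomes 44.4 per cent, as in the example of Specimen C above.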

94. Determining Order of Merit of Samples. 

The rank order of the samples is determined by referring to the
sum of the ratings for each sample in Table 29. The sample with the
highest total rating is highest in quality, the one with the next highest
is second, and so on. This gives useful information for construction
of a quality scale, since it indicates the relative order as given by the
combined judgment of nine judges. However, it does not tell the
exact distance between each sample along a known scale. It is
necessary to know the relation of the respective samples to each other
before a number of samples can be selected to represent approximately
equal distances on the scale.
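
As a brief sketch of this step, the rank order can be read off the totals of a count table. The small table below is invented for illustration (four samples, three judges), and the function name `rank_order` is hypothetical.

```python
def rank_order(counts):
    """Order the samples from highest to lowest total rating.

    counts[x][y] is the number of judges rating sample x better than y;
    the total for x is the sum of its row.
    """
    totals = {sample: sum(row.values()) for sample, row in counts.items()}
    order = sorted(totals, key=totals.get, reverse=True)
    return order, totals


if __name__ == "__main__":
    # Hypothetical counts for four samples rated by three judges.
    counts = {
        "A": {"A": 0, "B": 1, "C": 2, "D": 3},
        "B": {"A": 2, "B": 0, "C": 3, "D": 3},
        "C": {"A": 1, "B": 0, "C": 0, "D": 3},
        "D": {"A": 0, "B": 0, "C": 0, "D": 0},
    }
    print(rank_order(counts))
```

The totals give only the order of the samples; the distances between them come from the later steps.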

95. Determining the Scale Differences. 

The first step is to find the deviation of the percentages from the
median (50 per cent). This is accomplished by subtracting the
median (50 per cent) from the percentage values for each rating as given
in Table 30. These percentages are then expressed as positive or
negative deviations from the median (Table 31). All cases of 100 per cent
and zero per cent are omitted from the table since they do not operate
to affect the results. After the technique of developing quality scales
is mastered, the student will find it convenient to omit the preparation
of Table 31.


The percentage values which are given in Table 31 are next referred
to any table showing the fractional parts of the total area of the
normal probability curve corresponding to designated distances on the
base line. Such a table as Table X on page 91 of Garrett’s Statistics
in Psychology and Education (Longmans, Green and Company, 1927)
or Table 51 on page 219 of Thorndike’s Mental and Social
Measurements (Teachers College, Columbia University, 1916) affords the
necessary data. These tables show, for example, that Specimen C,
which was rated as better than Sample A by 44.4 per cent of the
judges (Table 30), lies below the quality of Sample A a distance of
5.6 per cent (Table 31), and is actually below the median of the
normal distribution of such qualities a distance of -0.14 standard
deviation unit. Specimen E was rated as better than A only 11.1 per
cent of the time. It therefore is 38.9 per cent below the median quality
represented by A. Table X in Garrett, or Table 51 in Thorndike,
indicates that a specimen of such quality lies 1.22 sigma units below
the quality of Specimen A. All the sigma-unit values given in Table
32 were obtained from these tables in a similar manner.
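
Where a printed table of the normal curve is not at hand, the same values can be approximated with the inverse normal distribution in Python's standard library. This is a sketch that swaps a computed value for the table lookup; the function name `sigma_units` is hypothetical, and returning `None` for 0 and 100 per cent mirrors the omission of those cases from Table 31.

```python
from statistics import NormalDist


def sigma_units(percent_better):
    """Sigma-unit distance from the median for a percentage of judges.

    Negative values lie below the median (50 per cent). Cases of 0 and
    100 per cent have no finite value and are returned as None.
    """
    if percent_better <= 0 or percent_better >= 100:
        return None
    return NormalDist().inv_cdf(percent_better / 100)


if __name__ == "__main__":
    for percent in (44.4, 11.1, 100):
        print(percent, sigma_units(percent))
```

Rounded to two places, 44.4 per cent gives -0.14 and 11.1 per cent gives -1.22, in agreement with the values read from the printed tables.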

Table 32 gives the standard deviation distance between each sample
and all other specimens not at the median or at the extreme end of the
scale. These data now make it possible to compute the scale difference
of the samples. The formula¹ for obtaining the scale differences is
as follows:

$$S_d = \frac{\sqrt{2}\,\sum (x_{1k} - x_{2k})}{N}$$

in which $S_d$ equals the scale separation of the samples in sigma
units, $\sum (x_{1k} - x_{2k})$ equals the sum of the sigma-unit
differences for the two specimens, and $N$ is the number of such
differences.

Example: Differences of Samples F and H. They rank third and fourth in
the series of twenty-two samples of soldering. (See Table 29.)

      F          H       Sigma-Unit Differences
    -1.22      -1.22              0
    +0.76      +0.14              0.62
    -1.22      -0.76              0.46
    -1.22      -1.22              0
    -1.22      +0.76              1.98
    -0.76      -0.76              0
    -1.22      -0.76              0.46
    +1.22      +1.22              0

Scale difference = (√2 × 3.52) / 8 = (1.414 × 3.52) / 8 = 4.9772 / 8 = 0.62.

¹ This formula and the proof for same are given by Thurstone in Journal
of General Psychology, Vol. 1: 405-423, 1928.
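
The worked example for Samples F and H can be checked with a short Python sketch. The list of differences is the third column of the example as printed; the function name `scale_difference` is hypothetical.

```python
import math


def scale_difference(sigma_differences):
    """Scale separation of two samples in sigma units:
    the square root of 2 times the sum of the sigma-unit differences,
    divided by the number of differences."""
    n = len(sigma_differences)
    if n == 0:
        raise ValueError("at least one sigma-unit difference is needed")
    return math.sqrt(2) * sum(sigma_differences) / n


if __name__ == "__main__":
    # Sigma-unit differences for Samples F and H, as listed in the example.
    f_and_h = [0, 0.62, 0.46, 0, 1.98, 0, 0.46, 0]
    print(round(scale_difference(f_and_h), 2))
```

The eight differences sum to 3.52, and the result rounds to 0.62, the F-H entry of Table 33.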




TABLE 32 

Sigma Differences Expressed as Deviations from the Median (50 Per Cent)
(From Table 51 in Thorndike’s Mental and Social Measurements)
CONSTRUCTION AND USE OF SCALES


The application of the same method to the other samples results
in the scale differences shown in Table 33. This table gives the rank
order according to the nine judges and the scale differences between
each sample on a comparable basis. It is known that the distance
from T to D is 1.09 sigma units, and on the same scale the difference
between D and F is 0.16 unit.

TABLE 33

Scale Differences Between Specimens

    Samples     Difference in Units
     T-D              1.09
     D-F              0.16
     F-H              0.62
     H-K              0.87
     K-O              0.25
     O-M              0.57
     M-V              0.49
     V-L              0.31
     L-N              0.33
     N-C              0.57
     C-E              0.48
     E-J              0.41
     J-G              0.98
     G-S              0.29
     S-Q              0.51
     Q-U              0.66
     U-A              0.64
     A-R              0.56
     R-P              0.59
     P-I              0.88
     I-B              0.29


96. Establishing a Point of Origin for the Scale. 

The next problem is to establish a zero point. It is permissible in
a quality scale of this type to assume a zero point. According to the
rating of the nine judges, sample T was rated the poorest of all the
samples. The question then arises: Is sample T the poorest conceivable
soldered lap joint? The answer is that it is very poor, but could
be worse and still hold together. Therefore, for the purposes of this
scale, Specimen T is assumed arbitrarily to have a value of 0.9 unit
above zero. There probably could not be a soldered lap joint of zero
quality since it would not hold together at all and could not be
considered a soldered joint. In assuming a zero point for a quality scale
in industrial education, it seems advisable to select a point from 0.8
to 1 above zero. After all, such a zero point or point of origin for
the scale is arbitrarily set up, whatever the method used.

With the zero point assumed as 0.9 unit, the twenty-two samples
are then ranked along the scale by adding the scale differences to the
assumed zero according to the relative rankings on the scale. Table
33 shows the sigma-unit differences in quality of each adjoining pair
of the twenty-two samples. Table 34 shows the scale values assigned
to each specimen. Sample T, being the specimen of poorest quality, is
given a scale value of 0.9, on the assumption that it has a quality
value of approximately 0.9 unit above the point designated as the
arbitrary zero point of the scale. Sample D is 1.09 sigma units better
than Sample T, therefore the scale value assigned to Sample D is
1.09 + 0.90 = 1.99 units above zero quality. Each exercise in ascending
order of merit is assigned a scale value corresponding to its merit
in relation to the next poorer specimen. The values in the parentheses
in Table 34 are sigma units of difference between the pairs of
specimens. The ascending values are the scale units of value or merit
assigned to each of the twenty-two specimens comprising the scale.
A graphic presentation of the relationship of these specimens to the
scale and to each other is given in Fig. 18.
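
The successive addition described in this paragraph amounts to a running sum. The following sketch, offered as an illustration under the stated assumptions and not as the authors' worksheet, starts from the assumed value of 0.90 for Sample T and adds the Table 33 differences in rank order. The names ORDER, DIFFERENCES, and scale_values are hypothetical.

```python
from itertools import accumulate

ZERO_OFFSET = 0.90  # assumed value of Sample T above the arbitrary zero

# Specimens in ascending order of merit, poorest first.
ORDER = "TDFHKOMVLNCEJGSQUARPIB"

# Scale differences between adjoining specimens, as read from Table 33.
DIFFERENCES = [1.09, 0.16, 0.62, 0.87, 0.25, 0.57, 0.49, 0.31, 0.33, 0.57,
               0.48, 0.41, 0.98, 0.29, 0.51, 0.66, 0.64, 0.56, 0.59, 0.88,
               0.29]


def scale_values(order, differences, zero_offset):
    """Running sum of the differences, beginning at the assumed zero offset."""
    running = accumulate(differences, initial=zero_offset)
    return {specimen: round(value, 2) for specimen, value in zip(order, running)}


values = scale_values(ORDER, DIFFERENCES, ZERO_OFFSET)
print(values["T"], values["D"], values["B"])
```

Sample D comes out at 1.99, as computed in the text, and each later specimen stands above its poorer neighbor by the corresponding difference.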

After the samples are ranked and scaled, the final step is to select
samples suitable for use in the quality scale. In doing this, the
authors have found that from eight to twelve samples make a satisfactory
scale for checking quality in industrial education. The first object
is to select samples for the scale whose quality values represent
approximately equal distances along the scale. This gives the samples
a definite rating. The scale can then be used as a measuring instrument
for rating quality. Fig. 19 shows a scale with eight samples and
another with thirteen samples. Both of these scales are taken from
the combined data in Fig. 18 (Table 34). It frequently occurs that
several samples will fall very close together on the scale, as D and F
in Fig. 18. This means that two samples of approximately equal
quality are available for that point on the scale. This also explains
the unnecessary computation involved when a large number of samples
of approximately the same difficulty are used, because several samples
fall at about the same place on the scale and only one or two are
needed for that point to make the quality scale.

Fig. 18. Specimens Assigned to a Linear Scale (Zero assumed to be 0.9 step
below sample “T”). [Linear scale marked from 0 to 13, with the specimens
in the order T, D, F, H, K, O, M, V, L, N, C, E, J, G, S, Q, U, A, R,
P, I, B.]

Scale 1

    T    D    H    K    M    L    C    J    G    Q    A     P     B
   0.9  2.0  2.8  3.6  4.4  5.3  6.1  7.0  8.0  8.9  10.1  11.3  12.4

Scale 2

    T    F    O    L    J    Q    R     B
   0.9  2.1  3.9  5.3  7.0  8.9  10.7  12.4

Fig. 19. Two Quality Scales Selected from the 22 Specimens.


TABLE 34
Scale Values

Zero assumed to be 0.9 step below sample T.

    Specimen    Difference from Next      Scale Value
                Poorer Specimen
       T                                     0.90
       D             (1.09)                  1.99
       F             (0.16)                  2.15
       H             (0.62)                  2.77
       K             (0.87)                  3.64
       O             (0.25)                  3.89
       M             (0.57)                  4.46
       V             (0.49)                  4.95
       L             (0.31)                  5.26
       N             (0.33)                  5.59
       C             (0.57)                  6.16
       E             (0.48)                  6.64
       J             (0.41)                  7.05
       G             (0.98)                  8.03
       S             (0.29)                  8.32
       Q             (0.51)                  8.83
       U             (0.66)                  9.49
       A             (0.64)                 10.13
       R             (0.56)                 10.69
       P             (0.59)                 11.28
       I             (0.88)                 12.16
       B             (0.29)                 12.45
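
The selection of approximately equally spaced samples can be sketched as follows. This is one possible rule, assumed here for illustration: mark off equally spaced target points between the poorest and the best specimen and take the unused specimen nearest each target. The authors do not state their rule, and the scales of Fig. 19 were not necessarily chosen in this way. The name pick_evenly_spaced is hypothetical, and the scale values are the running sums of the Table 33 differences.

```python
# Scale values of the twenty-two specimens (running sums from Table 33).
SCALE_VALUES = {
    "T": 0.90, "D": 1.99, "F": 2.15, "H": 2.77, "K": 3.64, "O": 3.89,
    "M": 4.46, "V": 4.95, "L": 5.26, "N": 5.59, "C": 6.16, "E": 6.64,
    "J": 7.05, "G": 8.03, "S": 8.32, "Q": 8.83, "U": 9.49, "A": 10.13,
    "R": 10.69, "P": 11.28, "I": 12.16, "B": 12.45,
}


def pick_evenly_spaced(values, count):
    """Choose `count` specimens lying nearest to equally spaced targets."""
    low, high = min(values.values()), max(values.values())
    step = (high - low) / (count - 1)
    chosen = []
    for i in range(count):
        target = low + i * step
        # Consider only specimens not already placed on the scale.
        candidates = [s for s in values if s not in chosen]
        chosen.append(min(candidates, key=lambda s: abs(values[s] - target)))
    return chosen


eight_sample_scale = pick_evenly_spaced(SCALE_VALUES, 8)
print([(s, SCALE_VALUES[s]) for s in eight_sample_scale])
```

Where two specimens lie close together, as D and F do, either would serve for that point, so a different but equally acceptable selection may be made by hand.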

Fig. 20 shows photographic reproductions of scales of quality for
end-splices, underwriter’s knots, solder joints, dados, and flat- and 
round-head screws. 

97. Reliability of Ratings on Quality Scales. 

The question may now be asked, are quality scales reliable measuring
instruments? Both the reliability of the pooled judgments of experts
and the reliability of ratings on quality scales have been shown
to be high by Paterson, Elliott, Anderson, and others ² in developing a
criterion of mechanical ability. These workers found the pooled
judgments of ten qualified judges to have a reliability of .90 or better.
They also found the reliability of quality scales to be high when the
ratings of two or more judges are averaged.

Fig. 20.

² Paterson, Elliott, Anderson, and others, Minnesota Mechanical Ability Tests,
The University of Minnesota Press, Minneapolis, Minnesota, pp. 194-202.

The authors have found that the reliability of rating of shop
products by means of quality scales is much higher than that normally
obtained under subjective conditions. However, it is advisable to
average three or four independent ratings if a reliability of .90 or
better is desired. These ratings may be made by three or four qualified
teachers or by the same teacher at widely separated intervals.
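
The gain from averaging several ratings can be illustrated with the Spearman-Brown formula. This is an assumption introduced here: the text does not name the formula or give the reliability of a single rating, and the figure of .70 used below is hypothetical. The function name pooled_reliability is likewise hypothetical.

```python
def pooled_reliability(single_rating_reliability: float, raters: int) -> float:
    """Spearman-Brown estimate of the reliability of the average of
    `raters` independent ratings, given the reliability of one rating."""
    r = single_rating_reliability
    return raters * r / (1 + (raters - 1) * r)


# Hypothetical single-rating reliability of .70 (not a figure from the text).
for raters in (1, 2, 3, 4):
    print(raters, round(pooled_reliability(0.70, raters), 3))
```

Under this assumption the average of four ratings slightly exceeds .90, which is in keeping with the advice to average three or four independent ratings.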

SUMMARY 

The rating of shop and drawing products involves problems which 
confront practically all industrial education teachers. The subjective 
ratings of shop and drawing teachers are as unreliable as teachers’ 
marks in other subject-matter fields. The rating of a shop project 
or a drawing requires the making of a complex judgment involving 
many variables. The reliability of industrial education teachers’ 
marks can be improved by using a project rating scale. 

Shop and drawing projects are rated by inspection, physical
measurement, and quality scales. A project rating scale lists the
variable characteristics of the project upon which the judgment is based
so that they may be considered individually, and the total rating is the
sum of the individual ratings. The project rating scale is also a
valuable teaching device for developing appreciation of those factors
which produce quality in workmanship and in diagnosing individual
pupil difficulties.

The development of a quality scale involves the collection of
representative specimens of varying merit, the arrangement of these
samples in order of merit by means of the use of pooled judgments,
the calculation of the differences in quality of the specimens in terms
of sigma units, the establishment of a zero point, and the arrangement
and evaluation of the specimens along the scale in relation to one
another and to the point of origin of the scale. The technique of pooled
judgments is probably sufficiently reliable for most purposes when as
many as ten or more qualified and critical judges are used. Quality
scales are reasonably reliable measuring instruments, but reach their
highest efficiency when the ratings of three or more judges are
averaged.

SUMMARY EXERCISES FOR DISCUSSION 

1. Point out the distinguishing features of project rating scales and quality rating
scales for shop products.

2. Make a project rating scale for some industrial art field other than mechanical
drawing. The mechanical drawing rating scale on pages 153 to 155
may be followed as an example.

3. Recapitulate the steps in preparing a quality rating scale.

4. In securing specimens for the construction of a quality rating scale, why is
it essential that a wide range of quality be sampled?

5. What importance do you see in the establishment of the zero point of a
quality scale?

CHAPTER XIII


RATING AND DEVELOPING PERSONALITY AND 
CHARACTER TRAITS 

98. Importance of Character and Personality Traits.

The very commonplaceness of the terms character and personality
may account for the fact that they have been given no very exact
definition. In a vague way we know what character is, and in a similarly
vague way we speak of personality, yet probably no other single
outcome of our social or educational programs is so important as the
development of proper character and personality traits. Every
serious-minded teacher, parent, or pupil realizes the importance of a
good character and a pleasing personality. But what is character?
What is personality? Can they be developed, or are they inherent?
Are we mere slaves of heredity doomed to act and to think in the same
way in similar environmental conditions, or is our destiny to a certain
extent in our own control?

Character has been defined in a discussion of psychology for
teachers as “the sum total of his [an individual’s] behavior in
relation to the world about him and to his fellow beings.” ¹ Thus,
character is taken to be synonymous with the behavior aspects of
personality. Character is revealed by an individual’s responses to
stimulation. If the responses are those commonly considered pleasing or
acceptable to his fellows, the individual is described as having a
“good” character. If the reverse, he has a “weak” character.

Lay usage has tended to attach to personality the notion of
individuality or distinctiveness in character. In many respects this has
had an unfortunate effect, for in their efforts to achieve
distinctiveness many individuals have revealed phases of character which are
far from pleasing. Individuality is undoubtedly important but not so
important that it should be achieved at the expense of honor, honesty,
morality, bravery, modesty, or other attributes of character, which, in
the past at least, have been considered desirable.

¹ Benson, C. E.; Lough, J. E.; Skinner, C. E.; West, P. V., Psychology for
Teachers, Ginn and Company, Boston, 1926.

Much damage has been done by self-styled “psychologists” who
have filled the literature of today with high-sounding suggestions by
which an individual is to find his latent powers and suddenly develop
a strong and forceful personality. Valentine ² states that a “little
exaltation will hurt no one. It is a healthful sort of feeling, but is no
substitute for intelligence, vocational ability, moral habits,
leadership, culture, social habits, or any other desirable quality which
inheritance and creative experience alone can supply.” The school, the
shop, and the home face a most difficult and critical problem in
providing the right kind of stimuli for the development of desirable
character and personality traits.

Personality can be modified and improved through diligent effort
over a period of time, but it is not a simple task which can be
accomplished in a few weeks or months. Personality needs always to be
modified and developed in the light of changing social and vocational
conditions. Obviously, since our personalities have resulted from the
modification of our original natures by our environment, we can
consciously develop more desirable modes of conduct by selecting the
type of adjustments and then developing habits accordingly. It is
true that too often personality emerges with little conscious
knowledge on the part of the pupil of what the desirable traits are which
have been found essential to success in life. However, the pupil and
teacher could at least in part direct the development of personality if
they have in mind an ideal toward which to work.

Jones ³ states that “In general, personality is the same as
individuality: it is that group of qualities and characteristics that makes
one an individual, that set him off from other individuals. As such,
it is the sum total of abilities, skills, interests, and physical and
mental characteristics that he possesses, or better still the
combination of all of these.”

As an illustration, let us consider the modern automobile. Its
present state of development is the partial realization of an ideal of
transportation which has been taking shape during a third of a
century. It required much planning and testing to bring the automobile
to its present state of efficiency. Today we use adjectives such as the
following to describe its traits: durable, economical, safe, speedy,
comfortable, and dependable. The characteristics of the automobile are
constantly being refined so that they can better meet the changing
social and economic conditions of the times. As long as automobiles
are used there will be a constant need for refinement and adaptation.
The same is true of a human personality. It is the result of the
interplay of many variable factors, and even after it is well developed
it needs constant refinement in order to keep adjusted to changing
environmental conditions.

² Valentine, P. P., The Psychology of Personality, D. Appleton and Company,
New York, 1927, p. 355.

³ Jones, Arthur J., Principles of Guidance, McGraw-Hill Book Company, New
York, 1930, Chapter X, p. 148.

99. Measuring Personality and Character Traits.

Devices for the measurement of personality and character traits
have generally taken the form of rating scales rather than of tests.
However, a number of these devices are so arranged that the individual
records his own reactions quite objectively. In this respect they
resemble tests. Such instruments differ from the typical rating scale
also in that the individual responding to the exercises is frequently
not aware of the fact that he is being measured for any particular
quality or trait. For example, in the Bernreuter Personality
Inventory ⁴ the subject is asked to indicate his reaction to 125 questions by
encircling one of the answers Yes, No, or ? which precedes each
question. The following samples taken from the test itself will illustrate
the types of exercises used:

  1. Yes No ? Does it make you uncomfortable to be “different” or
              unconventional?

  2. Yes No ? Do you day-dream frequently?

  5. Yes No ? Do you ever give money to beggars?

 15. Yes No ? Do you usually object when a person steps in front of
              you in a line of people?

 25. Yes No ? Do you study the motives of other people carefully?

 50. Yes No ? Do you usually try to avoid arguments?

100. Yes No ? Do you prefer to be alone at times of emotional stress?

By the arrangement of the material in the test several different
aspects of personality are measured at one time. According to the
author, the scales used in the scoring of the responses to the test are
very reliable. This may be due in part to the fact that the traits
measured are not readily detectable from the test itself. The
significance of the individual’s response to the questions is brought out by
the use of four separate scales in the scoring of the answers. For
example, Scale B1-N is a measure of neurotic tendency. Persons who
score high on this test tend to be emotionally unstable. In the case
of the exercises used in the sample above, a person who answers
question 1 with “Yes” scores +2 points. A “Yes” on question 2 adds
five more points. On the other hand a “No” for question 5 deducts
6 points. When Scale B2-S, the scale for self-sufficiency, is used on
these same answers, however, a reply of “Yes” for question 1 gives
a score of -4; a “Yes” on question 2 gives a score of +1; a “No” on
question 5 gives a score of -3 points; etc.

⁴ Bernreuter, Robert G., The Personality Inventory, Stanford University
Press, Stanford University, California, 1931.
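
The scoring of one set of answers by more than one scale can be sketched as follows. The weights shown are only those quoted above for questions 1, 2, and 5; the published keys cover every question and every answer, so treating any other answer as zero is an assumption made for this illustration. The names SCORING_KEYS and score_responses are hypothetical.

```python
# Partial scoring keys: (question number, answer) -> weight.
# Only the weights quoted in the text are included.
SCORING_KEYS = {
    "B1-N": {(1, "Yes"): 2, (2, "Yes"): 5, (5, "No"): -6},
    "B2-S": {(1, "Yes"): -4, (2, "Yes"): 1, (5, "No"): -3},
}


def score_responses(responses, key):
    """Sum the weights of the answers given; answers not listed in the
    partial key count as zero (an assumption of this sketch)."""
    return sum(key.get((question, answer), 0)
               for question, answer in responses.items())


responses = {1: "Yes", 2: "Yes", 5: "No"}
scores = {scale: score_responses(responses, key)
          for scale, key in SCORING_KEYS.items()}
print(scores)  # {'B1-N': 1, 'B2-S': -6}
```

The same three answers thus yield a different score on each scale, which is how one administration of the inventory gives several personality scores.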

The remaining scales, B3-I for introversion-extroversion, and B4-D
for dominance-submission, are applied in a similar manner. Scoring
the individual’s response to the questions by each of these four scales
gives rise to four sets of personality scores for each of which norms
are available. Persons scoring high in self-sufficiency are the types
who prefer to be alone, do not seek sympathy or encouragement, and
tend to follow their own inclinations rather than seek the advice of
others. Persons scoring high on the introversion-extroversion scale are
inclined to be imaginative. Those scoring low on this scale rarely
worry and prefer to act rather than to dream. Persons scoring high
on the dominance-submission scale tend to dominate others in
face-to-face situations.

Analysis of personality such as is afforded by the Bernreuter
Personality Inventory has been used with success and considerable
reliability with high-school students, college students, and adults. The
inventory itself is self-administering, there are no time limits, and
each person interprets the questions for himself.

Another type of attempt to secure an unbiased picture of certain
personality traits without the subject’s being completely aware of the
traits on which he is to be measured is represented in the
Loofbourow-Keys Personal Index.⁵ According to the statement in the manual for
the test itself, it “is an instrument for the detection of attitudes
indicative of problem-behavior. It is intended for use in group surveys
to identify those boys whose personal and social maladjustment is
such that they are, or are in danger of becoming, serious disciplinary
problems.” It is standardized for use in the junior-high-school grades,
although it has been found useful in senior-high-school and in
continuation-school groups. Brief samplings from each of the four test
parts comprising the battery are given here for illustrative purposes.
The total number of exercises in each test is indicated in the samples.

⁵ Loofbourow, Graham C., and Keys, Noel, Personal Index, Educational Test
Bureau, Inc., Minneapolis, 1933.

Test 1

Directions: This is a test of your word knowledge. Put an X in front of each
word you know. There are 100 words.

___ perceive
___ restore
___ grolo
___ luxury
___ tgUIc
___ verify
___ proportion
___ galine
___ exceed
___ patient

Test 2

Directions: Below are some words and phrases with some statements about
each one. Mark an X in front of the one statement under each word or phrase
which tells best how you feel about the thing named. Mark only one statement
under each one.

(Seventeen items.)

1. Chums:
   ___ It is hard to go without them.
   ___ You cannot always trust them.
   ___ They sometimes squeal on you.
   ___ They help you lie out of things.

3. Teachers:
   ___ They work hard.
   ___ They know they can punish you.
   ___ They are not fair to you.
   ___ They are kind of cranky.

10. Policemen:
   ___ They have it in for the kids.
   ___ They are glad to help you out.
   ___ It is fun to fool them.
   ___ They are just big bluffs.

Test 3

Directions: Read carefully and underline the one response which makes the
best answer for you. Underline only one.

(Twenty-one items.)

 1. Do you call another person by a nickname
    he or she does not like?                      Almost always   Sometimes   Hardly ever

 2. Do you keep right on studying when the
    teacher goes out of the room?                 Almost always   Sometimes   Hardly ever

21. Do you speak pleasantly to all the people you
    know even if you do not like them?            Almost always   Sometimes   Hardly ever

Test 4

Directions: Answer every question as truthfully and honestly as you can by
drawing a line under the right answer, as shown in the samples.

 A. Do you eat more than once a week?                          Yes   No

 B. Would you rather have a dime than a dollar?                Yes   No

(Eighty-nine items.)

 1. Would you like to wear expensive jewelry, rings, etc.?     Yes   No

 2. Do you feel bored a good deal of the time?                 Yes   No

50. Are you anxious to get away from school and get a job?     Yes   No

89. Do you know anybody who is trying to do you harm
    or hurt you?                                               Yes   No

The test is constructed in such a way that the undesirable responses
are the ones scored. The “problem” responses were determined by
comparing the answers given by “problem” boys with those made by
others of the same age and intelligence. Test 1, False Vocabulary, is
scored by allowing 1 point for each fictitious word. The possible score
on this part is 30 points. In Test 2, Social Attitudes, three of the four
possible answers are socially unacceptable. Each answer so marked
counts one error. The score is the number of errors multiplied by 2.
The possible score is 34 points. Test 3, Virtues, is scored by allowing
1 point for each fault confessed. The score is obtained by multiplying
the number of confessed faults by 2. The possible score is 42
points. For Test 4, the Adjustment Questionnaire, the score is the
number of “problem” responses. The possible score is 89 points.

The sum of the four scores listed above gives the subject’s personal
index. The highest possible index is 195 points. The author’s
statement of the significance of these personal indices is quoted from the
examiner’s manual for the test: ⁶ “An index of 30 or less is clearly
insignificant as regards problem behavior ... A score of 40 or higher,
however, strongly suggests an unwholesome trend, since such scores are
made by three out of four reform school boys, as compared with only
one in five of the others. Scores of 50 to 60 are much more highly
indicative, and occur but rarely in unselected groups.

“By noting those boys who show high personal indexes, say 40 or
over, principals, teachers, and counselors will have early brought to
their attention those individuals most likely to become disciplinary
problems and presumably in gravest need of observation and counsel.”

The reliability of the battery is approximately .90. 
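
The scoring rules for the four parts and the bands quoted from the manual can be gathered into a short sketch. It is an illustration only. The function and argument names are hypothetical, the manual is silent on indices from 31 to 39, and the treatment of indices above 60 as belonging to the highest band is an assumption.

```python
# Highest possible score on each part, as stated in the text.
MAX_SCORES = {"test1": 30, "test2": 34, "test3": 42, "test4": 89}


def personal_index(fictitious_words_marked, unacceptable_attitudes,
                   faults_confessed, problem_responses):
    """Sum of the four part scores described in the text."""
    part_scores = {
        "test1": fictitious_words_marked,      # 1 point per fictitious word
        "test2": unacceptable_attitudes * 2,   # errors multiplied by 2
        "test3": faults_confessed * 2,         # confessed faults multiplied by 2
        "test4": problem_responses,            # 1 point per "problem" response
    }
    for part, score in part_scores.items():
        if not 0 <= score <= MAX_SCORES[part]:
            raise ValueError(f"{part} score {score} is out of range")
    return sum(part_scores.values())


def interpret(index):
    """Bands taken from the quoted manual statement."""
    if index <= 30:
        return "clearly insignificant"
    if index < 40:
        return "not covered by the quoted statement"
    if index < 50:
        return "strongly suggests an unwholesome trend"
    return "much more highly indicative"  # assumption: applies above 60 as well


index = personal_index(5, 4, 6, 10)
print(index, interpret(index))  # 35 not covered by the quoted statement
```

The highest possible index in this sketch is 30 + 34 + 42 + 89 = 195 points, in agreement with the text.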


⁶ Op. cit.

100. Rating Scales for Character and Personality Traits.

In addition to these two types of more or less objective tests
designed to reveal personality and character differences a number of
general rating scales are also in common use. These are of two general
types. Individuals are ranked according to their standing in regard
to the specified character traits, or the character traits are rated and
assigned a rank. The best rating scales tend to lessen the spread of
teacher judgment and when two or three judgments are averaged are
found to give fairly reliable estimates for an individual’s traits.
Hollingworth,⁷ Shen,⁸ and Rugg ⁹ have all reported studies on the
reliability of rating character traits. On an average the reliability of
these ratings is about .55 but varies from .40 to .70 on the best
rating scales depending on the traits rated. Whenever possible in using
rating scales it is desirable to have two or three teachers rate the same
pupil and then average the ratings. Rugg reports this to be a fairly
satisfactory method although it is at times difficult to get pupils
rated by three different individuals who know the pupil equally well.

Self-rating scales are also used for allowing pupils to rate
themselves. This practice has some value in calling the pupil’s attention
to desirable traits, and in giving a better understanding of some of
the desirable and undesirable traits. It has been found that pupils
tend to rate themselves too high on the desirable traits, but in
general the reliability of their ratings is about the same as results
obtained on other types of rating scales.

Industrial education teachers have a definite need for measures of
character and personality traits. Since the scales are the best
measures available, it seems desirable to use them as one aid in
pointing out and rating character and personality traits of pupils. It is
well to bear in mind, however, that the general reliability of the best
scales is only about .55, and the results obtained are only suggestive
but are to be preferred to the unaided subjective judgment of the
teacher.

The most common use for rating scales in teaching is to study 
pupils who are doing unsatisfactory work, although the results are 
valuable in developing character and personality traits in all pupils. 

⁷ Hollingworth, H. L., Judging Human Character, D. Appleton and Company,
New York, 1922.

⁸ Shen, E., “The Reliability Coefficient of Personal Ratings,” Journal of
Educational Psychology, Vol. 16: 232-36, April, 1925.

⁹ Rugg, H. O., “Is the Rating of Human Character Practicable?” Journal of
Educational Psychology, Vol. 12: 425-38, 485-501, November, December, 1921;
Vol. 13: 30-42, 81-93, January, February, 1922.

Scales have also been developed for rating shop teachers. Scales of
this type can be used by the supervisory officers or for self-rating
and analysis. Teachers and supervisors should keep in mind at all
times in using the results of rating scales that the average or median
of several ratings is more reliable than the rating by one individual.

The values of trait rating are summarized as follows by Dr.
Hughes: ¹⁰

1. Trait rating affords the teacher a better understanding of the individual
student. The teacher cannot conscientiously fill out the record unless she knows.

2. It affords a basis for the modification of school and classroom procedures.
If these traits and attitudes are valuable in education, then the school situations
and methods need to be adjusted to their development.

3. It gives a better understanding of special groups, such as above-average
and superior students who are doing poor school work, or below-average students
who are doing superior work, which entitles them to membership in honor
societies, etc.

4. Follow-up of trait rating brings out the fact that teachers’ marks for
scholastic achievement are based, to a large extent, on the student’s possession of
desirable character traits. These data indicate that teachers should be trained to
give marks for scholastic achievement alone, and that other marks should be
devised for character traits and attitudes, because they are important enough to
deserve separate consideration.

5. Cooperation of parents in filling out trait-rating scales for their own children
will tend to bring about a better understanding between the home and the
school, resulting in better cooperation.

6. Self-rating by the students, on the same scale upon which they are being
rated by teachers and parents, will tend to turn the students’ attention to the
importance of cultivating proper traits and attitudes. Students are inclined to
attach importance to things which are being measured, recorded, and used.

7. Justice in marking and teacher judgments will be more apt to be accorded
all groups of students when teachers and counselors have a more accurate
knowledge of the character traits of their students than they could possibly gain by
their own subjective judgments.

8. Trait rating and analysis will result in more scientific counseling, because
it will help to furnish a wider basis of knowledge and information about the
students upon which to predicate advice.

101. Desirable Traits in Industrial Education. 

It is generally agreed that the public school has responsibilities in
developing and helping to establish desirable character and personality
traits. The industrial education teacher and his co-workers in other
instructional fields have a share in this responsibility. We believe,
also, that they have a definite contribution to make. In the first place,
it requires time to develop personality. Personality continues to
develop as the result of an interplay between the original nature of the
individual and the environment, regardless of the conscious attention
given to it. Accordingly, the problem of the industrial education
teacher who wishes to help develop desirable personalities in his pupils
is first to find out what types of personalities adjust themselves
best in our present complex social and economic life. In the second
place, a desirable personality must be cultivated, and the desirable
traits must be encouraged. The undesirable ones must be weeded
out and their expression discouraged. However, it must also be
remembered that personality development is limited by the innate
capacity of the individual, and so its development may be expected to
vary markedly under similar environmental conditions.

¹⁰ Hughes, W. Hardin, “Organized Personnel Research and Its Bearing on
High School Problems,” Journal of Educational Research, Vol. 10: 386-398,
December, 1924.

102. Constructing and Using Scales for Rating Personality Traits in 
Industrial Education. 

Although scales for rating personality traits have not been as
widely used as tests of information, intelligence, and mechanical
aptitude, several usable techniques have been developed which will aid
the teacher who desires to construct and use the trait rating scales.
The first problem in the construction of such scales is the selection
of the traits to be rated. This selection may be based on the
observation and experience of the teacher, conferences with other
instructors, conferences with the administrative officers, talking with
the pupils and their parents, conferences with industrial and social
leaders in the community, and suggestions from authoritative
literature. After a rather exhaustive list has been prepared, a number,
perhaps twenty, of the most important items should be selected for use in
the rating scale. This selection may be accomplished through the
use of pooled judgments of teachers and others interested in
personality development. The traits selected must also conform to the
general purpose of the course of study and be of such a nature that the
industrial education teacher will have an opportunity to observe the
expression of the traits in and about the school. For purposes of
illustration let us consider the following personality traits which were
selected by the authors after an analysis of the problem. The traits
are listed and defined in terms of observable pupil responses.

Self-reliance. This means that there has been developed in the
student the habit of planning tasks carefully and thoughtfully and of
carrying them out with only necessary assistance. The problem is
obviously too difficult before assistance is called for by the student.

Industry. This means a habit of careful, thoughtful work without
loitering or wasting time.

Readiness to assume responsibility. This means that a task
though difficult should not be avoided if worth doing, and when once
undertaken should be carried through to completion.

Punctuality. This means the ability to arrive on time and fit
oneself to a program.

Cooperation. This means a readiness to assist others when they
need help, and to join in group undertakings.

Consideration of others. This means a thoughtful attitude in the
making of things easy and pleasant for others. It involves keeping
things in order, putting tools away in good condition, and always
doing a full share of work where others are involved.

Cleanliness and neatness. This means the ability to keep
physically clean and neat in both work and dress.

An optimistic viewpoint toward life. This means an appreciation
of the joy of living and a belief that life is worth while.

After the character traits to be rated have been selected and
defined, the next step is to put them into a rating scale which will permit
the greatest amount of objectivity in scoring. There are several
acceptable methods of accomplishing this, depending on whether the
pupils are to be given a relative rank according to their traits, or
whether the character traits of individuals are to be rated and assigned
a rank. Dr. Hughes gives the following three procedures used in
rating:

Method I. Normal Distribution. In this method we apply the principle
represented in the “normal curve of distribution.” In any large number of
unselected cases we find a few who possess a given quality in maximum
degree, and a correspondingly small number who possess it in minimum
degree. A much larger number, however, possess the quality in average
degree. This general principle holds whether we consider height, weight,
strength, or any other measurable quality or characteristic. For a scale
consisting of five equal steps, we should have approximately the following
distribution of cases on a percentage basis:

    Lowest    Inferior    Medium    Superior    Highest
       7         24         38         24          7

But for practical purposes we have adopted a theoretical distribution as
follows:

    Lowest    Inferior    Medium    Superior    Highest
      10         20         40         20         10

Assuming that the individuals who are to be rated are unselected and
representative we should have 10 in 100 marked “highest”; 10, “lowest”; 40,
“medium”; and 20, “inferior” and “superior” respectively. A convenient method
of rating such a group is to have the names on individual cards and then
arrange these cards in five piles according to the percentage distributions
required. The rater should as far as possible dismiss from mind every other
item of the scale and concentrate on the one being rated.

The method of “normal distribution” is most usable with large and unselected
numbers. When the number of cases is small and selected the method is
defective. For this reason another method, based on the same principle, is
presented.

Hughes, W. Hardin, “General Principles of Rating Trait Characteristics,”
Educational Research Bulletin, Pasadena, Vol. 3, Nos. 5 and 6, February-March,
1925.

Method II. The Master Scale. To use this method, proceed somewhat as
follows: Suppose the trait for which a master scale is to be made is industry.

1. Recall any student known to possess this trait in highest degree. Write his
name opposite “highest” in the master scale.

2. Now, recall any student known to possess this trait in lowest degree and
write his name opposite “lowest” in the master scale.

3. Then recall any student known to possess the trait in average degree; write
his name opposite “medium.”

This gives three definite standards for comparison. The other places in the
scale may now be filled in with names of two students half way between
“medium” and “highest” and half way between “medium” and “lowest,”
respectively. You now have a master scale as follows:

MASTER SCALE FOR INDUSTRY

    Rating      Person         Numerical Value*
    Highest     John Jones          180
    Superior    Dick Brown          140
    Medium      Sam Johnson         100
    Inferior    Henry James          60
    Lowest      Bill Smith           20

* The numerical values here assigned represent the half-way point in each 40
of a 200-point scale.

With this master scale in mind, the teacher is now ready to rate her students
in industry. Suppose Tom Black is to be rated. The teacher quickly decides
whether Tom is as good as John Jones, as poor as Bill Smith, or just about like
Sam Johnson, etc. Master scales for the other traits may be made and used in
the same way.

The advantages of this scale are that it is objective and that small
numbers of students can be rated without immediate reference to the “normal
curve of distribution.” In the long run, however, the percentage distributions
should approximate those given under Method I.
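The arithmetic behind the two methods quoted above can be sketched briefly in Python. This is only an illustration: the function names are hypothetical, and the rule for handing out leftover cards when the class size does not divide evenly is an assumption of this sketch, not something stated by Hughes.

```python
def pile_sizes(n_pupils, percents=(10, 20, 40, 20, 10)):
    """Split n_pupils name cards into five piles, lowest to highest."""
    # Whole cards per pile and the remainder, using integer arithmetic.
    parts = [divmod(n_pupils * p, 100) for p in percents]
    sizes = [whole for whole, _ in parts]
    leftover = n_pupils - sum(sizes)
    # Assumption: leftover cards go to the piles with the largest remainders.
    by_remainder = sorted(range(len(parts)),
                          key=lambda i: parts[i][1], reverse=True)
    for i in by_remainder[:leftover]:
        sizes[i] += 1
    return sizes


def master_scale_values(top=200, steps=5):
    """Half-way point of each equal band of a `top`-point scale, lowest first."""
    band = top / steps
    return [band * k + band / 2 for k in range(steps)]


print(pile_sizes(100))         # [10, 20, 40, 20, 10]
print(master_scale_values())   # [20.0, 60.0, 100.0, 140.0, 180.0]
```

With 100 pupils the piles match the theoretical distribution of Method I, and the five band mid-points match the numerical values printed in the master scale.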

103. A Useful Personality Rating Scale. 

The authors have found the following type of scale valuable in
rating personality traits in industrial education classes. It will be
noted that it uses a form of the graphic method with the quality units
spaced roughly corresponding to the normal distribution curve.

A GRAPHIC RATING SCALE
FOR PERSONALITY TRAITS IN INDUSTRIAL EDUCATION

Name ____________________          Date ____________________

Instructor ______________          Rating __________________

Directions. Provision is made for two or more ratings of each personality
trait on the basis of observable pupil responses. Place a check on the line which
in your judgment represents a true estimate of the present status of the trait
being rated.

    Minimum              Average              Maximum
    |____|____|____|____|____|____|____|____|

[A graphic line of this form accompanies each question below.]

SELF-RELIANCE

Does the pupil plan his work carefully and thoughtfully?

Does the pupil conduct the work with only necessary help?

Does the pupil ask for help when the problem is too difficult?

INDUSTRY

Is the pupil in the habit of doing careful and thoughtful work?

Does he loiter or waste time in his work?

READINESS TO ASSUME RESPONSIBILITY

Is the pupil willing to undertake a worth-while task even though it is difficult?

Does the pupil finish all his work?

PUNCTUALITY

Does the pupil arrive on time to classes?

Does the pupil hand his work in on time?

COOPERATION

Does the pupil help others when help is needed?

Is the pupil active in group undertakings?

CONSIDERATION OF OTHERS

Does the pupil have the habit of making things pleasant for his classmates?

Does the pupil help keep the shop in order?

Does the pupil put tools away in the right places?

When the whole class is involved in some work does he do his share or skip
away?

CLEANLINESS AND NEATNESS

Does the pupil wash clean?

Does the pupil dress neatly and in good taste?

Is the pupil neat in doing his work?

OPTIMISTIC VIEW OF LIFE

Does the pupil have a natural likable smile?

Does the pupil complain about his lot in life?

Is the pupil liked by his classmates?

To secure a total rating, score each question on the basis of the following key:

    | 1 | 3 | 7 | 13 | 19 | 23 | 25 | 26 |
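As a rough sketch of how such a key might be applied, the following Python fragment totals the key values for a set of checked positions. It assumes that each check falls in one of eight divisions of the graphic line, counted from Minimum (1) to Maximum (8); the scale itself does not spell this out, and the function name and sample checks are hypothetical.

```python
# Key values printed with the scale, read from Minimum to Maximum.
KEY = (1, 3, 7, 13, 19, 23, 25, 26)


def total_rating(checked_positions, key=KEY):
    """Sum the key values for one checked position (1..8) per question."""
    return sum(key[pos - 1] for pos in checked_positions)


# Hypothetical checks on five questions for one pupil.
example_checks = [4, 5, 5, 6, 3]
print(total_rating(example_checks))   # 13 + 19 + 19 + 23 + 7 = 81
```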

Methods of rating personality traits and of ranking pupils have
been given and illustrated, but thus far self-analysis by the individual
has not been discussed except to mention that it is not significantly
higher in reliability than ratings of traits by outsiders. There is a
tendency for individuals to rate themselves too high on the desirable
traits and too low on the undesirable traits, just as there is a
tendency of persons rating others they know very well to be influenced by
the “halo effect.” This refers to the tendency of the one rating
character traits to rate them all about the same, depending on the rater’s
general opinion of the individual.

SUMMARY

This chapter presents a number of suggestions for the rating of
personality and character traits. Personality in its broader sense is
developed in an individual by the interplay of stimuli from the
environment and the native capacities of the individual. The thesis of
this chapter is that a desirable personality can be developed over a
period of time through careful practice and an understanding of what
is desirable in a well-rounded personality. Industry, cooperation,
consideration for others, self-reliance, readiness to assume responsibility,
and an optimistic view toward life are suggestive of desirable traits
to be developed by industrial education teachers as co-workers with
teachers in other instructional fields.

Tests of personality traits have not proved as reliable as rating
scales, and for this reason major emphasis is placed on methods of
developing and using rating scales. Authorities report the average
reliability of rating scales at around .55. The most reliable ratings
are obtained when the ratings of three or more judges are pooled.
Some rating scales rank individuals from highest to lowest according
to their personality traits; others give a graphic rating of the
individual traits. In still other types the pupils are allowed to rate
themselves. None of these methods is highly reliable, but they are better
than the unaided subjective judgments of teachers.

SUMMARY EXERCISES FOR DISCUSSION

1. Distinguish clearly between the terms character and personality.

2. What, in your opinion, is the responsibility of the industrial arts teacher for
the development of desirable personality and character traits?

3. Secure from your instructor a copy of the Bernreuter Personality Inventory,
administer the exercises to yourself, and prepare a self-analysis on the basis
of the personality qualities identified in this instrument.

4. Prepare a personality rating scale including the traits which in your judgment
(coupled with information gained from reading in the field) are essential
to success in teaching.

CHAPTER XIV


SUMMARIZING THE RESULTS OF TESTING

Experience in the observation of the work of individual students
and in the use of tests in the classroom leads to the conclusion that
wide differences in pupil accomplishment may be expected. This
means that scores representing objective measures of achievement in
the classroom will vary widely. Since the human mind is not able
to grasp and hold numerous unlike facts in isolation, it is apparent
that some fairly simple and accurate methods of summarizing and
describing such widely varying results are necessary. This process
of summarizing, analyzing, and compressing data so that they may be
given adequate description is the application of statistical methods.

Six statistical techniques are very useful in the analysis and
interpretation of educational test results. They are: (1) the
classification and tabulation of data; (2) the computation and interpretation
of the common measures of central tendency; (3) the computation
and interpretation of the more common measures of variability; (4)
the expression of the extent and nature of the interrelations of
measurable factors; (5) the derivation and use of standards, norms, and
various derived scores for the purposes of comparison and
interpretation of test results; and (6) the use of simple and effective graphic
procedures for the presentation of facts. This chapter summarizes
the discussion of these six points.

I. THE TABULATION OF TEST SCORES
104. Need for Grouping of Data.

The physical, mental, moral, and social unlikenesses of people
make it impossible to describe them in few words. If all men were
of the same height it would be a simple matter to describe the height
of men. This need for a method of grouping data for convenience in
treating and describing the situation is illustrated in the test scores
given in Table 35 on page 188. This table shows the scores made by
a group of 27 eighth-grade pupils on the Newkirk-Stoddard Home
Mechanics Test, Form A. An examination of this table shows that
Pupils 7 and 9 made the low and the high scores on this test. The
remaining 25 pupils made scores between these extremes. Even with this
small class, with scores ranging from 2 points to 15 points, it is
apparent that it is difficult for the teacher to secure a clear picture of
the achievement of the class without some treatment of the scores.
One of the very easiest of these procedures is the arrangement of the
test scores in order of size from the highest to the lowest. This is
called ranking the scores. These 27 scores have been ranked in
descending order of size in Table 36. It enables the teacher to discover
readily the highest and lowest scores, and, after some training and
experience, to pick the scores which are more or less typical for the
group.
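The ranking just described can be sketched in a few lines of Python. The scores are those of Table 35; the names `SCORES` and `rank_scores` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict

# Table 35: pupil number -> number of jobs right.
SCORES = {1: 6, 2: 3, 3: 8, 4: 12, 5: 3, 6: 4, 7: 2, 8: 6, 9: 15,
          10: 10, 11: 7, 12: 6, 13: 6, 14: 9, 15: 5, 16: 7, 17: 11, 18: 6,
          19: 8, 20: 7, 21: 5, 22: 10, 23: 9, 24: 5, 25: 8, 26: 6, 27: 7}


def rank_scores(scores):
    """Return (score, [pupil numbers]) pairs in descending order of score."""
    pupils_by_score = defaultdict(list)
    for pupil in sorted(scores):
        pupils_by_score[scores[pupil]].append(pupil)
    return sorted(pupils_by_score.items(), reverse=True)


for score, pupils in rank_scores(SCORES):
    print(f"{score:>2}  {', '.join(map(str, pupils))}")
```

The printed list has the same layout as Table 36: each score once, followed by the pupils who made it.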

The arrangement of test scores in order of size is helpful only
when the number of pupils in the class is relatively small, as 10 or
less. When more pupils are present it usually becomes necessary to
group the scores into the form of a table. This is called making a
frequency distribution, and the table naturally is called a frequency
table.

105. Steps in Preparing a Frequency Table. 

The three essential steps in the grouping of data in a frequency
table are as follows:

1. The determination of the range of the scores.

The range is the difference between the largest score and the
smallest score in the array or series of scores. For the test scores given in
Table 35, the range is 13, the difference between 15, the highest score,
and 2, the lowest score.

2. The determination of the number and size of the groups in the
classification.

TABLE 35

Scores in Number of Jobs Right Made in Eighth-Grade Class on
Newkirk-Stoddard Test of Home Mechanics

    Pupil    Test      Pupil    Test      Pupil    Test
    Number   Score     Number   Score     Number   Score
       1       6         10      10         19       8
       2       3         11       7         20       7
       3       8         12       6         21       5
       4      12         13       6         22      10
       5       3         14       9         23       9
       6       4         15       5         24       5
       7       2         16       7         25       8
       8       6         17      11         26       6
       9      15         18       6         27       7

TABLE 36

Scores in Table 35 Arranged in Descending Order of Size

    Test Score    Pupil
        15        9
        12        4
        11        17
        10        10, 22
         9        14, 23
         8        3, 19, 25
         7        11, 16, 20, 27
         6        1, 8, 12, 13, 18, 26
         5        15, 21, 24
         4        6
         3        2, 5
         2        7

The number of groups, or class intervals as they are called, in a
frequency table depends upon the range of the scores, as well as upon
the size of the interval which it seems best to use. No specific rule
covering this situation can be stated. In general it seems safe to say
that it is usually unwise to group data into fewer than 15 to 18
intervals. On the other hand, the use of 25 or more intervals may
introduce an unnecessary amount of labor in making the tabulation, and
many less than 15 may introduce a serious error of grouping. Some
useful suggestions on the relation of the range to the size and number
of the class intervals to use in making a frequency table are given in
Table 37.

TABLE 37

Suggested Relation of Range of Scores and Size of Class Intervals

    For a Range of      Use a Class Interval of
    25 or less                    1
    26 to 69                      3
    70 to 125                     5
    126 to 175                    7
    176 or more                  15
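The suggestions of Table 37 amount to a simple lookup, sketched below in Python. The function name is hypothetical, and the interval sizes are copied from the table as printed.

```python
def suggested_interval(score_range):
    """Class-interval size suggested by Table 37 for a given range of scores."""
    if score_range <= 25:
        return 1
    if score_range <= 69:
        return 3
    if score_range <= 125:
        return 5
    if score_range <= 175:
        return 7
    return 15


# Range of the Table 35 scores: highest 15, lowest 2.
print(suggested_interval(15 - 2))   # 1
```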


Error of Grouping. The so-called error of grouping results from
the practice of putting into the same group, or interval, scores which
are very widely unlike and in which only a few cases are found. It
is greater when the number of scores is small and when they are
found at irregular intervals along the scale. For example, the
tabulation of two scores of 44 into a class interval of three units (44, 45,
and 46) with a mid-point of 45 involves an average error of grouping
of 1.0 point, whereas the tabulation of seven scores of 44, 44, 45, 45,
46, 46, 46 into the same class interval introduces a smaller error. In
placing these seven scores in the same class interval it is assumed
that they will together have an average value approximately the same
as the mid-point of the step (in this case 45). In this latter example,
the error is not serious, since the average of these seven scores is
actually 45.14 instead of 45. In tables with fewer and larger class
intervals, it must be obvious that this error due to grouping may be
much greater. The underlying idea in grouping scores into 15 to 25
intervals is to organize the scores into a sufficiently small number of
classes so that they may be thought about effectively, and yet not
place them in so few groups that important differences are covered
up or significant errors of grouping are introduced.
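The two examples in this paragraph can be checked with a short Python sketch. The function names are hypothetical; the first measures the average distance of the scores from the mid-point, the second measures how far their true average lies from it.

```python
def mean_grouping_error(scores, midpoint):
    """Average distance of the scores from the mid-point that replaces them."""
    return sum(abs(s - midpoint) for s in scores) / len(scores)


def mean_shift(scores, midpoint):
    """Difference between the true average of the scores and the mid-point."""
    return sum(scores) / len(scores) - midpoint


two_scores = [44, 44]
seven_scores = [44, 44, 45, 45, 46, 46, 46]

print(mean_grouping_error(two_scores, 45))     # 1.0
print(round(mean_shift(seven_scores, 45), 2))  # 0.14
```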

3. Tabulating the scores. 

The tabulation of test scores corresponds in many respects to the
filing of letters or cards. The value or size of the score determines the
filing compartment into which it is placed. The necessity for this
exact classification makes clear to the student why it is so important
that the limits of the class intervals be so exactly established in the
preparation of the steps for a table. There must be no question as
to the intervals in which each specific score is placed.

Undoubtedly the best way to clear up the problems of tabulation
is to illustrate by making use of some actual test scores. In Table 35
the scores made by 27 eighth-grade pupils on the Newkirk-Stoddard
Test of Home Mechanics are given. These test scores cover a range
from 2 to 15 points. The real significance of 27 scores with such a
range can not be gathered from the form in which they are given in
Table 35. A simple form of grouping or compressing these data is
used in Table 36, which shows the number of the individual pupil
who made each specific score. As a matter of fact this table is turned
into a simple frequency table by the process of setting up a scale of
class intervals in place of the actual test scores and by substituting
check marks or tabulation marks for the individual pupil numbers.
Table 37 suggests that a class interval with a step of 1 unit be used
in arrays in which the range is 25 points or less. Accordingly, in this
problem, class intervals of 1 with whole numbered mid-points are set
up. Table 38 shows the results of setting up the intervals on this
basis. The tabulation of the scores is shown in this table in the third
column. In tabulating test scores the common practice is to make a
vertical mark (/) for each score of a given magnitude. When the
frequency of any given score reaches 5, the fifth frequency is indicated
by a diagonal mark crossing four of the vertical tabulation marks. In
this manner, the frequencies are conveniently grouped by 5's, which
simplifies the summation of frequencies in large populations.

TABLE 38

Data from Tables 35 and 36 Arranged in Frequency Distribution

    Class          Mid-      Tabulation    Frequencies
    Intervals      Points    Marks             (f)
    14.5-15.5        15      /                  1
    13.5-14.5        14                         0
    12.5-13.5        13                         0
    11.5-12.5        12      /                  1
    10.5-11.5        11      /                  1
     9.5-10.5        10      //                 2
     8.5- 9.5         9      //                 2
     7.5- 8.5         8      ///                3
     6.5- 7.5         7      ////               4
     5.5- 6.5         6      ///// /            6
     4.5- 5.5         5      ///                3
     3.5- 4.5         4      /                  1
     2.5- 3.5         3      //                 2
     1.5- 2.5         2      /                  1

                             Total, or N =     27

A further illustration of the tabulation of a series of test scores
is given in Table 39. The scores used in this case represent total
comprehension scores of a ninth-grade class on the Iowa Silent
Reading Test, Advanced Examination.¹ The test scores of the 71 students
comprising this class are as follows: 104, 129, 94, 87, 118, 146, 109,
163, 140, 125, 58, 86, 102, 103, 133, 77, 117, 114, 99, 110, 103, 93,
123, 137, 89, 118, 117, 107, 109, 117, 114, 162, 135, 115, 101, 109,
150, 130, 100, 109, 140, 102, 110, 148, 94, 122, 139, 116, 105, 125,
104, 141, 127, 100, 107, 116, 136, 142, 96, 103, 111, 145, 99, 105, 108,
98, 126, 112, 162, 114, 109.

The range of the scores is found by subtracting 58, the smallest
score, from 163, the largest score. The difference is 105. Table 37
suggests class intervals of 5 units for such a range. Accordingly, a
class interval large enough to accommodate a score of 163 points is
set up at the top of the table. Using a mid-point divisible by the
size of the step means that the mid-point of this interval will be 165
and that the limits of the interval will be 162.5 to 167.5 points. The
remainder of the table is developed in a similar way until the
complete series of 22 intervals necessary for this range of scores is built
up. The table must provide for the entire range of the scores, from
the largest to the smallest. The limits of the class interval at the
bottom of the table are 57.5 to 62.5, which provides for the score of
58 at the lower end of the range of scores. Table 39 shows the entire
range of the class intervals, the mid-points, the tabulation marks, and
the frequencies, based on these 71 reading-test scores.²

TABLE 39

Comprehension Scores on Iowa Silent Reading Test, Advanced
Examination

    Class            Mid-      Tabulation       Frequencies
    Intervals        points    Marks                (f)
    162.5-167.5       165      /                     1
    157.5-162.5       160      /                     1
    152.5-157.5       155                            0
    147.5-152.5       150      ///                   3
    142.5-147.5       145      //                    2
    137.5-142.5       140      /////                 5
    132.5-137.5       135      ////                  4
    127.5-132.5       130      //                    2
    122.5-127.5       125      /////                 5
    117.5-122.5       120      ///                   3
    112.5-117.5       115      ///// ////            9
    107.5-112.5       110      ///// ///// /        11
    102.5-107.5       105      ///// ///             8
     97.5-102.5       100      ///// ///             8
     92.5- 97.5        95      ////                  4
     87.5- 92.5        90      /                     1
     82.5- 87.5        85      //                    2
     77.5- 82.5        80                            0
     72.5- 77.5        75      /                     1
     67.5- 72.5        70                            0
     62.5- 67.5        65                            0
     57.5- 62.5        60      /                     1

                                           N =      71

¹ Iowa Silent Reading Test, Advanced, Forms A and B, World Book
Company, Yonkers, New York.
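The tabulation procedure illustrated in Tables 38 and 39 can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes whole-number scores and an odd interval size, as in every row of Table 37, so that each mid-point is a multiple of the interval size.

```python
from collections import Counter


def midpoint_for(score, width):
    """Mid-point (a multiple of `width`) of the interval holding `score`.

    Assumes an integer score and an odd width.
    """
    return (score + width // 2) // width * width


def frequency_table(scores, width):
    """Rows of (lower limit, upper limit, mid-point, frequency), highest first."""
    counts = Counter(midpoint_for(s, width) for s in scores)
    top = midpoint_for(max(scores), width)
    bottom = midpoint_for(min(scores), width)
    half = width / 2
    return [(m - half, m + half, m, counts.get(m, 0))
            for m in range(top, bottom - 1, -width)]


# Scores of Table 35, in order of pupil number.
TABLE_35 = [6, 3, 8, 12, 3, 4, 2, 6, 15, 10, 7, 6, 6, 9, 5, 7, 11, 6,
            8, 7, 5, 10, 9, 5, 8, 6, 7]

for low, high, mid, f in frequency_table(TABLE_35, 1):
    print(f"{low:>5}-{high:<5} {mid:>3}  {'/' * f:<6} {f}")
```

Run on the Table 35 scores with an interval of 1, this gives the frequencies of Table 38. Called with an interval of 5 on any scores running from 58 to 163, it sets up intervals from 57.5-62.5 at the bottom to 162.5-167.5 at the top.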

II. MEASURES OF CENTRAL TENDENCY 

The second of the important statistical techniques required in
connection with the summary of educational test results deals with the
computation and interpretation of the common measures of central
tendency. This is the process of computing a single measure or term
which may be used in describing the complete array of data in the
table. The term central tendency arises through the fact that these
measures are commonly found near the center of the distributions of
scores when the scores are arranged in order of size.

Three measures of central tendency are commonly used in the
interpretation of educational tests. These are: the arithmetic mean, the
median, and the mode. In general, these measures are named in the
order of their use in present-day test interpretation. As a matter of
fact, the mode is considered to be such an unreliable measure that it
is rarely used in educational measurements. In this discussion,
consideration will be given only to the first two of the measures of central
tendency, namely, the arithmetic mean and the median.

106. The Arithmetic Mean. 

Almost everyone knows how to find a simple arithmetic mean or
average, as it is commonly called, by dividing the sum of a series of
measures by the number of measures. However, not everyone knows
that there is a rapid and reasonably accurate method of computing
the arithmetic mean of large numbers of measures in frequency tables.
The speed and the satisfactory accuracy with which this important
measure of central tendency may be computed for distributions of
large numbers of cases has made it one of the most popular and useful
of the measures of central tendency.

The calculation of the arithmetic mean from a frequency
distribution requires a somewhat different concept of the measure than that
used when it is computed from ungrouped data. For the 27 test scores
given in Table 35 the sum of the measures is 191. The arithmetic
mean, the result of dividing 191 by 27, is 7.07. When computed in
this way the arithmetic mean does not especially require definition.
It is easier simply to state how it is found. When computed by the
so-called shorter method, the arithmetic mean is defined as a point on
the scale such that the sum of the deviations of the values larger
exactly equals the sum of the deviations of the values smaller than
it is. For those who think most clearly in concrete terms this
arithmetic mean may be considered as the point in a beam of irregular but
increasing thickness at which the fulcrum must be placed to bring it
into perfect balance.

² A much more detailed discussion of the problem of tabulating test scores
will be found in Greene’s Work-Book in Educational Measurements (Longmans).
Extensive practice in tabulation of test scores is given in Problems 1, 2, 3, and 4
of Work-Unit I of the above-mentioned Work-Book.

The actual computation of the arithmetic mean from a frequency
table proceeds on the principle of the mathematical determination of
the proper position of the fulcrum of such a beam from data resulting
from a trial balance. That is, the beam is suspended on the fulcrum
as nearly as can be determined by estimation, then the actual amount
that the beam is out of balance is measured. Experience shows that
the fulcrum must be moved in the direction of the heavy end of the
beam in order to bring it into balance. The exact amount of this
shift in position depends upon the difference in the forces bearing on
the two ends of the beam. If there are 60 units of unbalanced force
tending to turn a beam in a certain direction and there are 40 measures
(scores) contributing to the distribution, it means that the fulcrum
must be moved toward the heavy end of the beam an amount equal
to 1.5 scale units of length (60 ÷ 40 = 1.5). This should bring the
beam into balance.

107. Steps in Computation of Mean. 

The specific steps to be taken in the computation of the arithmetic
mean by the shorter method are as follows:

Step 1. Select the mid-point of some central step on the scale as
an assumed mean. Call this point zero. (In computing the
arithmetic mean from a frequency distribution the scores in a given step
are all assumed to be grouped at the exact center of the step, hence
the assumption of the mid-point of the step as the zero.)

Step 2. Mark off steps of deviation above and below this assumed
zero point, maintaining the algebraic signs.

Step 3. Multiply the frequency in each step by the deviation of the
step. Carry the algebraic signs of these deviations. Those above the
zero step should be plus; those below it should be minus.

Step 4. Find the algebraic sum of these deviations, keeping the sign
of the result.

Step 5. Divide this value by the number of cases in the
distribution, and multiply this result by the number of units in each step.
This result is the correction (c).

Step 6. Depending upon the sign of this correction (c), increase or
decrease the value of the mid-point of the step taken as the zero by
the amount of c. This should give the arithmetic mean.

This procedure may be made clear by actually working the mean
of the test scores tabulated in Table 38. The work is shown in detail
in the accompanying table (Table 40).


TABLE 40

Distribution of Scores Taken from Table 35 and the Calculation of
Arithmetic Mean of the 27 Scores

    Class Intervals         f        d        fd
    14.5  (15)  15.5        1       +7         7
    13.5  (14)  14.5        0       +6         0
    12.5  (13)  13.5        0       +5         0
    11.5  (12)  12.5        1       +4         4
    10.5  (11)  11.5        1       +3         3
     9.5  (10)  10.5        2       +2         4
     8.5   (9)   9.5        2       +1         2
     7.5   (8)   8.5        3        0      (+20)
     6.5   (7)   7.5        4       -1        -4
     5.5   (6)   6.5        6       -2       -12
     4.5   (5)   5.5        3       -3        -9
     3.5   (4)   4.5        1       -4        -4
     2.5   (3)   3.5        2       -5       -10
     1.5   (2)   2.5        1       -6        -6

                        N = 27              (-45)

Step 1. Assumed mean = 8.0.

Step 2. Lay off deviations.

Step 3. Add plus and minus fd's.

Step 4. Find algebraic sum of fd's. -45 + 20 = -25.

Step 5. Divide this algebraic sum by N: -25 ÷ 27 = -0.926.
        Multiply by size of the step: -0.926 × 1 = -0.926.

Step 6. 8.000 - 0.926 = 7.074 = Arithmetic mean.
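The six steps worked in Table 40 can be sketched in Python as follows. The function name and argument names are hypothetical; the mid-points and frequencies are those of Table 40.

```python
def mean_short_method(midpoints, freqs, assumed_mean, step):
    """Arithmetic mean of a frequency distribution by the shorter method."""
    n = sum(freqs)
    # Steps 2-4: deviation d of each step from the assumed mean, in step
    # units, multiplied by its frequency f and summed algebraically.
    sum_fd = sum(f * (m - assumed_mean) / step
                 for m, f in zip(midpoints, freqs))
    # Step 5: the correction c.
    correction = sum_fd / n * step
    # Step 6: apply the correction to the assumed mean.
    return assumed_mean + correction


midpoints = list(range(15, 1, -1))                  # 15, 14, ..., 2
freqs = [1, 0, 0, 1, 1, 2, 2, 3, 4, 6, 3, 1, 2, 1]  # column f of Table 40

print(round(mean_short_method(midpoints, freqs, 8, 1), 3))   # 7.074
```

Any other mid-point taken as the assumed mean leads to the same result, since the correction changes to compensate.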

108. The Median. 

The simplification of the work involved in computing the
arithmetic mean has done much to stimulate its general use in the
interpretation of educational test results in place of the median. However,
the ease with which the median may be obtained, and the fact that it
does not give undue weight to extreme scores as does the arithmetic
mean, have made it a popular measure for use in educational
measurements.

Some confusion has been created in the minds of students and
teachers through a lack of consistency in the methods of computing the
median. For many years it was common practice to instruct users of
tests to arrange the test papers for a class with the scores in
descending order of size, and take the score on the middle paper as the score
best representing the achievement of the class. For a long time this
score was called the median. As a matter of fact, the measure of
central tendency obtained in this way from ungrouped data is a crude
median, but in order to distinguish it from the true median computed
from data in a frequency distribution there is a tendency to call it the
mid-measure. The mid-measure is a counting median found from
ungrouped data. The median is always computed from grouped data.

Computing the Mid-Measure. By definition the mid-measure is
the score of the middle paper when the number of test papers is odd,
and the average of the two scores nearest the middle when the
number is even, assuming that the test papers are arranged in definite
order of magnitude. The method of computing this very simple
measure may be illustrated by referring to the data given in Table 35.
The 27 test scores given in this table arranged in descending order
of size are as follows: 15, 12, 11, 10, 10, 9, 9, 8, 8, 8, 7, 7, 7, 7, 6, 6, 6,
6, 6, 6, 5, 5, 5, 4, 3, 3, 2. The mid-measure is found by counting off
the scores until the middle paper is reached. In this case, it will be
the score on the fourteenth paper, or 7 points. If there were only
26 papers and the high score of 15 were missing, the average of the
thirteenth paper from either end of the scale would be used as the
mid-measure.³ Under these conditions the mid-measure would thus
be the average of 6 and 7, or 6.5 points.
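The counting rule for the mid-measure can be sketched in Python as follows. The function name is hypothetical; the scores are the 27 scores of Table 35 in descending order.

```python
def mid_measure(scores):
    """Score of the middle paper, or the average of the two middle papers."""
    ordered = sorted(scores, reverse=True)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


scores = [15, 12, 11, 10, 10, 9, 9, 8, 8, 8, 7, 7, 7, 7,
          6, 6, 6, 6, 6, 6, 5, 5, 5, 4, 3, 3, 2]

print(mid_measure(scores))       # 7   (fourteenth of 27 papers)
print(mid_measure(scores[1:]))   # 6.5 (high score of 15 removed)
```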

Computing the Median. The median is defined as a point on the
scale such that exactly 50 per cent of the cases in the distribution are
above it and 50 per cent of the cases are below it. The median is
distinguished from the mid-measure by the fact that the former is a
point on a scale whereas the latter is an actual score on a test paper
(or the average of the two scores lying nearest the middle paper).
The fact that the score on the middle paper of a series is not the same
thing as the middle point in the scale of a frequency table of the same
scores makes it important that the two types of measures be defined
and distinguished in use. It will be a movement in the direction of
uniformity of interpretation of test results if the median is always
understood as being computed from data in a frequency distribution.

* See Problem 11 in Greene, Work-Book in Educational Measurements
(Longmans), for additional drill on computing the mid-measure.
In the earlier discussion of methods of tabulation, and particularly
in the explanation of the computation of the arithmetic mean, it was
pointed out that all the measures falling in a given step are assumed to
have the value of the mid-point of that step. This is necessary since
the computation of the arithmetic mean involves the correction of an
assumed mean and this correction may take place in either a positive
or negative direction. Now, in computing the median a very different
assumption is made, and since this point frequently causes considerable
difficulty and confusion, the reasons for making it are explained
here in some detail. Since the median is a counting measure and is
obtained by counting into the distribution until a point is reached
which throws one-half of the frequencies below it, it is necessary to
assume that all scores assigned to a given step are distributed
uniformly throughout the step. When working with the median, or
measures of a similar character such as percentiles, all scores are
assumed to be scattered through the step in this uniform manner. It
may help to think of the steps or class intervals as air-tight
compartments, and the scores or frequencies assigned to the steps as a
volatile gaseous substance which expands and completely fills the
compartment, regardless of how many or how few the frequencies may
be. If four scores are assigned to a given step, each one of the cases
represents one-fourth of the total area of the step. If there are 20 cases
per step, each case is considered to represent one-twentieth of the area
of the step in computing the median.

To find the median of a series of scores arranged in a frequency
table take the following steps:

Step 1. Divide the total number of cases in the distribution (N)
by 2 to determine 50 per cent of the cases. (See Table 38. N/2 is
13.5.)

Step 2. Beginning at the bottom of the column of frequencies,
count up the frequencies as far as possible without exceeding the
half-sum (N/2). (In Table 38 the frequencies 1 + 2 + 1 + 3 + 6 equal
13, which is still less than the half-sum, 13.5.)

Step 3. Take the difference between the half-sum and the subtotal.
(In this example the difference between 13.5 and 13 is 0.5
case.) This difference shows the number of cases which must be
taken from the step in which the median is located. Since there are
four cases in the next step, the step with the limits 6.5 to 7.5, one-half
a case out of these four cases must be taken. Thus the median is
located 0.5/4 or one-eighth of the way through the step. This shows
the proportion of the step which must be passed in counting the
frequencies in order to reach a point on the test scale such that exactly
one-half of the cases lie below it and one-half lie above it.

Step 4. Since the four cases in this step are assumed to be
distributed uniformly throughout the step, the fraction 0.5/4, or
one-eighth, represents the fraction of the step which must be passed in
order to reach the point known as the median. The fraction one-eighth
is equivalent to the decimal 0.125. Since this value 0.125 is
in steps and the steps in this table are one-point intervals, this value
must be multiplied by 1. This means that the median is 0.125 unit
beyond (above) the lower limit of the step into which this fractional
unit is taken.

Step 5. The beginning (lower limit, since the frequencies were
counted from the lower end of the distribution) of the step having the
four frequencies is 6.5. Therefore the 0.125 unit must be added to the
value 6.5. The median thus becomes 6.625. For practical purposes
the decimal may be rounded to 6.63.

Step 6. In statistical work of all kinds accuracy is extremely
important. It is therefore very desirable to check all computations. The
calculation of the median may be conveniently checked by the simple
process of adding the frequencies down from the top of the distribution.
In this case the interpolation would be 0.875 and would be
subtracted from the top of the step, or from 7.5.*
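
The six steps can be condensed into a short Python sketch. It is an illustration only: the function name `grouped_median` and the variable names are assumptions made for the example, and the frequency lists are transcribed from Tables 38 and 41 with the steps arranged from the bottom of the scale upward.

```python
def grouped_median(lower_limits, freqs, step):
    """Median of a frequency table, counting up from the bottom.

    lower_limits: lower limit of each step, in ascending order
    freqs:        frequency in each step, in the same order
    step:         size of the class interval
    """
    half_sum = sum(freqs) / 2                  # Step 1: N/2
    subtotal = 0
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and subtotal + f >= half_sum:
            needed = half_sum - subtotal       # Step 3: cases still needed
            # Steps 4 and 5: interpolate uniformly through the step.
            return lower + (needed / f) * step
        subtotal += f                          # Step 2: keep counting up
    raise ValueError("frequency table is empty")


# Table 38: one-point steps with lower limits 1.5, 2.5, ..., 14.5.
T38_LOWER = [1.5 + i for i in range(14)]
T38_FREQS = [1, 2, 1, 3, 6, 4, 3, 2, 2, 1, 1, 0, 0, 1]

# Table 41: five-point steps with lower limits 57.5, 62.5, ..., 162.5.
T41_LOWER = [57.5 + 5 * i for i in range(22)]
T41_FREQS = [1, 0, 0, 1, 0, 2, 1, 4, 8, 8, 11,
             9, 3, 5, 2, 4, 5, 2, 3, 0, 1, 1]

print(grouped_median(T38_LOWER, T38_FREQS, 1))
print(round(grouped_median(T41_LOWER, T41_FREQS, 5), 2))
```

The check in Step 6 (counting down from the top) is not coded separately; under the uniform-distribution assumption it must give the same point, so it serves as a check on hand arithmetic rather than as a second method.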

The actual work of computing the median of the silent-reading 
scores given in Table 39 is given in detail in Table 41. 

* This and a number of other points in the computation of the median are
discussed and explained in connection with Illustrations 7, 8, and 9 (pages 35 to 38
inclusive) in the Work-Book in Educational Measurements by H. A. Greene
(Longmans). Problems 12, 13, 14, and 15 in this Work-Book also provide
extensive drill on the finding of medians from all types of distributions.

TABLE 41

Distribution of Test Scores of 71 Ninth-Grade Pupils. Total Comprehension
Scores from Iowa Silent Reading Test: Advanced

Class Intervals      f
162.5-167.5          1
157.5-162.5          1
152.5-157.5          0
147.5-152.5          3
142.5-147.5          2
137.5-142.5          5
132.5-137.5          4
127.5-132.5          2
122.5-127.5          5
117.5-122.5          3
112.5-117.5          9
107.5-112.5         11
102.5-107.5          8
 97.5-102.5          8
 92.5- 97.5          4
 87.5- 92.5          1
 82.5- 87.5          2
 77.5- 82.5          0
 72.5- 77.5          1
 67.5- 72.5          0
 62.5- 67.5          0
 57.5- 62.5          1
                N = 71

Step 1: 71/2 = 35.5 = half-sum.
Step 2: 1 + 1 + 2 + 1 + 4 + 8 + 8 = 25 = subtotal.
Step 3: 35.5 − 25 = 10.5.
Step 4: 10.5/11 = 0.954; 0.954 × 5 (size of step) = 4.77.
Step 5: 107.5 + 4.77 = 112.27 = median.
Step 6: Check. 35.5 − 35 = 0.5; 0.5/11 × 5 = 0.23; 112.5 − 0.23 = 112.27 = median.

109. Uses of the Arithmetic Mean and the Median.

The question which of these very useful measures of central tendency
to use in test interpretation frequently arises. In many respects
there is not a great deal of choice. Prior to the general adoption of
the shorter methods of computing the arithmetic mean the median was
very popular. It is simple to compute, and furthermore, is considered
especially suitable for test interpretation because of the fact that
widely deviating scores do not unduly influence it. On the other hand,
the arithmetic mean is now very easily calculated, and for most
experimental purposes it appears to be quite important to have each
individual score given weight in the results in direct proportion to
its magnitude. The greatly increased use of educational tests for
experimental purposes has thus naturally tended to increase the
popularity of the arithmetic mean as a measure of central tendency. In
general, and in the absence of any other guiding principle, use the
median in all interpretations or comparisons in which the median
itself was used in securing the basis for comparison. That is, if test
results are to be compared with test norms which are based upon
medians, then the medians of the test results should be computed and
used. Comparative norms based upon means may well be compared
with class means. For most experimental purposes the arithmetic
mean should be used, particularly where the scores of individuals are
compared with their own scores obtained under experimental controls.
In most experimental studies it is desirable for all measures to receive
consideration, and furthermore, in most such studies other measures,
such as the standard deviation, are required. Since these additional
statistical measures are usually based upon the same processes as those
used in calculating the mean, the arithmetic mean is the logical measure
to use.
III. MEASURES OF VARIABILITY

The need for measures of variability in the interpretation of test 
scores arises through the fact that two groups of pupils may earn 
scores on a test which will have the same medians or arithmetic means 
and yet represent distinctly different types of instructional situations. 
At least two types of descriptive measures of a distribution are needed 
before all its essential features can be revealed. The measures of 
central tendency reveal the points on the scale where the typical scores 
are most likely to be found. Some method of expressing variability 
is required to reveal differences in range of talent. 

The two groups of scores presented as Class A and Class B in
Table 42 illustrate this situation very clearly. The means of the two
series of scores are identical, each being 86. The range of the scores
for Class A is 72 (122 − 50), which is exactly three times the range
(98 − 74 = 24) of the Class B scores. The quartile deviation computed
from the ungrouped scores is 15 for Class A and 7 for Class B.
The standard deviations of the scores are 20.17 and 7.3 for the Class
A and Class B scores respectively. Even the most inexperienced
teacher or student must recognize that very different ranges of ability
are present in these two classes and that correspondingly different
instructional problems are presented to the teacher.

TABLE 42

Illustration of Need for Measures of Variability

Class A               Class B
  122                    98
  116                    96
  108                    95
  101                    93
   96                    90
   92                    89
   89                    87
   86       Means        86
   83                    85
   80                    83
   76                    82
   71                    79
   64                    77
   56                    76
   50                    74
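
A few lines of Python make the contrast concrete. This is a sketch only: the names `CLASS_A`, `CLASS_B`, `mean`, and `score_range` are chosen for the illustration, and the scores are those of Table 42.

```python
# Scores from Table 42 (variable names are illustrative).
CLASS_A = [122, 116, 108, 101, 96, 92, 89, 86, 83, 80, 76, 71, 64, 56, 50]
CLASS_B = [98, 96, 95, 93, 90, 89, 87, 86, 85, 83, 82, 79, 77, 76, 74]


def mean(scores):
    """Arithmetic mean of ungrouped scores."""
    return sum(scores) / len(scores)


def score_range(scores):
    """Scale distance between the highest and the lowest score."""
    return max(scores) - min(scores)


for name, scores in (("Class A", CLASS_A), ("Class B", CLASS_B)):
    print(name, "mean =", mean(scores), "range =", score_range(scores))
```

Both classes print the same mean, while the ranges differ by a factor of three, which is the point the table is meant to carry.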
110. The Range.

Of the three commonly used measures of variability mentioned in
the illustration in the previous paragraph the range is the easiest to
find and the least useful measure. The range is the scale distance
between the lowest and the highest scores in an array. The very
definition of the range makes it apparent that it is one of the least reliable
measures, since it is so readily affected by the fluctuation of either of
the extreme scores. In arrays of test scores or frequency tables in
which the scores fall into line quite regularly the range may be a
fairly consistent measure. In the illustration given in Table 30 the
range is almost as effective in revealing the spread of ability as the
standard deviation, which is usually considered to be one of the most
reliable measures of dispersion. This, without doubt, may be traced
to the consistency with which the test scores vary above and below
the means. It may be sufficient to point out here that the range is
rarely used as an evidence of dispersion or variability in the
interpretation of educational-test scores since such scores rarely fit into the
scale with the consistency shown in the illustration in Table 30.


111. Quartile Deviation. 


The particular merit of the semi-interquartile range or quartile
deviation (Q) as a measure of the variability of test scores lies in the
fact that it utilizes the range of the middle half of the cases rather
than the range of the extremes. In actual practice the range of the
middle half of the cases is found by counting off frequencies from the
lower end of the distribution until a point cutting off 25 per cent of
the cases is located. The method of finding this point is identical
with the procedure in computing the median except that only 25 per
cent instead of 50 per cent of the cases in the distribution are
considered. This point is commonly designated as Q1. A point on the
scale which cuts off 25 per cent of the cases from the top of the
distribution is found in a similar way. This is known as Q3. The
remaining cases included between these two points are the middle
50 per cent. The reliability of this measure lies therefore in the fact
that it is based upon the portion of the distribution in which the
density of the population is greatest.

The quartile deviation (Q) is found by taking one-half of the
difference in the scale values of the points Q3 and Q1. The formula for
this measure of variability is $Q = \frac{Q_3 - Q_1}{2}$. The computation
of Q is illustrated in terms of the data presented in Table 41, and since
the procedures are essentially the same as those used in finding the
median the steps are summarized very briefly.

Step 1. Find 25 per cent of the cases. In this problem one-fourth
of the cases equals 17.75.

Step 2. Summate the frequencies beginning at the bottom until a
point not in excess of 17.75 cases is reached. 1 + 1 + 2 + 1 + 4 + 8
= 17. The difference between this subtotal and 17.75 is 0.75 case.

Step 3. 0.75/8 times 5 equals 0.47 unit.

Step 4. Add 0.47 unit to the beginning of the interval in which
the next 8 cases are found. Thus, 102.5 + 0.47 equals 102.97, which is
Q1 for this distribution.

Step 5. Summate the frequencies beginning at the top of the
distribution. 1 + 1 + 3 + 2 + 5 + 4 = 16. The difference between this
subtotal and N/4, or 17.75, is 1.75 cases.

Step 6. 1.75/2 times 5 equals 4.38 units.

Step 7. Since this computation of Q3 is proceeding from the top
of the distribution the 4.38 units must be subtracted from the top of
the step into which the interpolation is made. Thus, 132.5 − 4.38 =
128.12, which is Q3 for this distribution.

Step 8. Q3 − Q1, or 128.12 − 102.97, equals 25.15, which is twice
the value of the semi-interquartile range. Thus 25.15 divided by 2
equals 12.58, or the quartile deviation for this distribution.
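
The eight steps can be sketched in Python as follows. The helper name `point_below` and the variable names are assumptions for this illustration; the frequencies are transcribed from Table 41, bottom step first. The sketch counts up from the bottom for both quartile points, which under the uniform-distribution assumption locates the same Q3 as counting down from the top.

```python
def point_below(lower_limits, freqs, step, fraction):
    """Scale point below which the given fraction of the cases falls."""
    target = sum(freqs) * fraction
    subtotal = 0
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and subtotal + f >= target:
            # Interpolate uniformly through the step holding the point.
            return lower + (target - subtotal) / f * step
        subtotal += f
    raise ValueError("frequency table is empty")


# Table 41: five-point steps with lower limits 57.5, 62.5, ..., 162.5.
T41_LOWER = [57.5 + 5 * i for i in range(22)]
T41_FREQS = [1, 0, 0, 1, 0, 2, 1, 4, 8, 8, 11,
             9, 3, 5, 2, 4, 5, 2, 3, 0, 1, 1]

q1 = point_below(T41_LOWER, T41_FREQS, 5, 0.25)
q3 = point_below(T41_LOWER, T41_FREQS, 5, 0.75)
quartile_deviation = (q3 - q1) / 2

print(round(q1, 2), round(q3, 2), round(quartile_deviation, 2))
```

Because the code carries full precision through every step, its results can differ from hand work in the last decimal place, where intermediate values were rounded.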

The values Q and the median (Q2) are frequently confused. They
are quite different measures, however. The median or fiftieth
percentile is a measure of central tendency; Q is a measure of
variation. Q expresses the variability of an array in terms of the average
distance from the center of the distribution it is necessary to go in
either direction to include the middle half of the cases.

112. Standard Deviation. 

Such simple devices as the range and quartile deviation (Q) are
sufficiently exact for many ordinary situations involving the
interpretation of test results. However, other statistical analyses demand
more refined measures of variability. For this type of work the
standard deviation is generally used. The standard deviation is the
square root of the mean of the squares of the deviations from the mean
of a distribution. Expressed in symbols the standard deviation is
$\sigma = s\sqrt{\frac{\sum fd^2}{N} - c^2}$, in which $\sum fd^2$ equals
the squared deviations expressed in the form of the sum of the products
of the frequencies at each step by the square of the deviation of each
step from the assumed mean; N equals the number of cases in the
distribution; c equals the correction used in computing the arithmetic
mean; and s represents the size of the class interval of the
distribution in units.

The standard deviation may be computed about any common
measure of central tendency, although in common practice it is usually
computed about the arithmetic mean. There is at least a theoretical
advantage in using the mean as the point around which to determine
the standard deviation. The arithmetic mean is the point in a
distribution about which the sum of the squared deviations is the least.

113. Meaning of the Standard Deviation. 

The likenesses and differences of the quartile deviation (Q) and the
standard deviation (σ) are shown in Fig. 21. The quartile deviation,
or semi-interquartile range, by definition takes into account the middle
50 per cent of the cases. That is, the ordinates (lines erected
perpendicular to the base line of the curve) erected at a scale distance
equal to Q on either side of the mean or median include 50 per cent
of the area of the surface between the curve and the base line. In
the diagram the lines K and L represent the ordinates erected at a
distance equal to Q on either side of the mean. The lines M and N
represent ordinates erected at a distance equal to σ on either side of
the mean. The standard deviation (σ) takes into account approximately
68 per cent (in a normal distribution 68.26 per cent) of the
area of such a distribution.

In a normal distribution the value sigma bears a definite
relationship to the curve of the distribution itself. When a normal
distribution is presented in graphic form the result is a symmetrical
bell-shaped curve with many cases in the middle and few at the extremes.
Certain types of these characteristic bell-shaped distributions have
come to be called normal curves. For these normal curves, formulas
have been derived from which such typical curves may be computed
if certain basic data concerning the curve are given. In these formulas
sigma is one of the values which must be given in order to construct
such a curve. Sigma, in the typical formula, represents the
distance from the mean at which the curve changes from convex to
concave. In Fig. 21 the points where the curve changes its character
are indicated by the ordinates lettered M and N.

Thus, because of this direct mathematical relationship which the
standard deviation bears to the curve of the distribution itself, and the
reliable expression of variability which it provides due to the fact
that every deviation in the distribution is considered, the standard
deviation is one of the most useful of the measures of variability.
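
The areas cut off by the ordinates in Fig. 21 can be checked numerically for a normal curve. The sketch below is an illustration under stated assumptions: the function name `area_within` is invented here, and the relation Q = 0.6745σ is the standard value for a normal distribution rather than a figure quoted in this chapter.

```python
import math


def area_within(k):
    """Fraction of a normal distribution lying within k sigma of the mean."""
    return math.erf(k / math.sqrt(2))


# Ordinates M and N: one sigma on either side of the mean.
print(round(area_within(1.0), 4))

# Ordinates K and L: one Q on either side, taking Q = 0.6745 sigma
# (an assumption that holds only for a normal distribution).
print(round(area_within(0.6745), 4))

# Two and one-half sigma on either side of the mean.
print(round(area_within(2.5), 4))
```

The three printed fractions correspond to roughly two-thirds, one-half, and almost all of the area, in that order.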

114. Computing the Standard Deviation (σ) from Ungrouped Data.

In the computation of the standard deviation from ungrouped data,
as in the accompanying illustration, the mean of the distribution must
be found. When the data are grouped in a frequency table it is not
strictly necessary for the arithmetic mean to be computed, although it
is necessary to go through all but the last step of the process.

The steps in the computation of the standard deviation from
ungrouped data are given in detail in connection with data from
Table 42. See Table 43.

TABLE 43

Data for Class A from Table 42

Test Scores     d (Deviation)     d² (Deviations Squared)
    122             +36                  1296
    116             +30                   900
    108             +22                   484
    101             +15                   225
     96             +10                   100
     92             + 6                    36
     89             + 3                     9
     86               0                     0
     83             − 3                     9
     80             − 6                    36
     76             −10                   100
     71             −15                   225
     64             −22                   484
     56             −30                   900
     50             −36                  1296

Total 1290                          Σd² = 6100
Mean    86

$\sigma = \sqrt{\frac{\sum d^2}{N}} = \sqrt{\frac{6100}{15}} = \sqrt{406.67} = 20.17$
The mean of the scores for Class A in Table 43 is 86. Thus a
score of 89 deviates from this mean by 3 points. A score of 96
deviates 10 points, etc. The standard deviation (σ) is the square root of
the mean of the squares of these deviations from the mean of the array
of scores. It is necessary therefore to square each of these deviations.
These are given under the column headed d². Since each deviation
appears only once and the data are ungrouped, the formula may be
simplified to read $\sigma = \sqrt{\sum d^2 / N}$. The sum of the
deviations squared is 6100. The mean of these squared deviations is
therefore 406.67. To turn this value into units of the scale
the square root of this quantity must be taken. This value is found
to be 20.17, which is the standard deviation (σ) of this series of
scores. The mean of this distribution is 86. The σ is 20.17. This
means that, between scores 20.17 points larger and 20.17 points
smaller than this mean, approximately two-thirds (68.26 per cent)
of the scores will be found, provided the distribution is approximately
normal.
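
The arithmetic of Table 43 can be retraced with a short Python sketch. The names `CLASS_A`, `sum_squared_deviations`, and `standard_deviation` are illustrative choices, not part of the original method; the scores are those of Class A.

```python
import math

# Class A scores from Table 42 (variable name is illustrative).
CLASS_A = [122, 116, 108, 101, 96, 92, 89, 86, 83, 80, 76, 71, 64, 56, 50]


def sum_squared_deviations(scores):
    """Sum of the squared deviations of the scores from their mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


def standard_deviation(scores):
    """Square root of the mean of the squared deviations from the mean."""
    return math.sqrt(sum_squared_deviations(scores) / len(scores))


print(sum_squared_deviations(CLASS_A))        # the d-squared column total
print(round(standard_deviation(CLASS_A), 2))  # sigma for Class A
```

The divisor here is N, the number of cases, as in the text, and not N − 1 as in some later treatments of sample statistics.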


115. Computing the Standard Deviation from Grouped Data. 

The method of computing the standard deviation from ungrouped
data illustrated in Table 43 may be applied with very few changes to
the calculation of sigma from grouped data. A slight change in the
general formula is required, for, when the scores are grouped in class
intervals, the deviations of the scores must be considered by groups
having the mid-points of the steps in which they are found. This
permits the expression of the deviations in steps rather than in units
of the scale. The formula for use in calculating the standard
deviation when the data are grouped in a frequency distribution is
$\sigma = s\sqrt{\frac{\sum fd^2}{N} - c^2}$. The steps in the application
of this formula in the calculation of the standard deviation of the
scores originally presented in Table 38 will make clear all the processes
involved in finding the sigma of a frequency distribution. The
computations themselves are shown in Table 44.

Step 1. Assume a mean as near as possible to the true mean of
the distribution in order that the correction (c) may be as small as
possible. If the correction is larger than 0.5 step it may be desirable
to start the work over and assume a mean nearer the true mean. In
Table 44 the step with a mid-point of 7 is taken as the zero step.

Step 2. Lay off deviations above and below the step assumed as
zero, and multiply the frequencies in each step by the deviation of
the step exactly as in the calculation of the arithmetic mean.

Step 3. Summate the plus fd's and the minus fd's algebraically. The
sum of the fd's in this problem is +2 units. Divide this by the number
of cases in the table (N = 27), and the resulting correction (c)
is +0.074. This correction is the same as the one used in computing
the mean.

Step 4. Square this correction in order to have it in the same
denomination as the values from which it must later be subtracted. The
square of c (+0.074) is 0.005 in this problem.

Step 5. Multiply the values under the column headed fd by
the values under d. This will give a column of values known
as fd². Summate this column. In this problem the sum of the fd²
is 218.

Step 6. Divide the sum of the fd² by N, the number of cases in
the distribution. The result of this division is 8.074.

Step 7. Since this value, 8.074, is always too large in proportion
to the amount the true mean deviates from the assumed mean (in this
case, the amount represented by the value c) it must be reduced by an
amount equal to the square of c. Thus,

8.074 − 0.005 = 8.069.

Step 8. The sigma of this distribution expressed in steps is next
obtained by extracting the square root of the value 8.069. The square
root of this value to two decimal places is 2.84. Since the class
intervals used in this frequency table are steps of one unit, the standard
deviation is therefore 2.84.†

TABLE 44

Data from Table 38

Class Intervals   Mid-points    f      d      fd     fd²
14.5-15.5             15        1     +8      +8      64
13.5-14.5             14        0     +7       0       0
12.5-13.5             13        0     +6       0       0
11.5-12.5             12        1     +5      +5      25
10.5-11.5             11        1     +4      +4      16
 9.5-10.5             10        2     +3      +6      18
 8.5- 9.5              9        2     +2      +4       8
 7.5- 8.5              8        3     +1      +3       3
 6.5- 7.5              7        4      0       0       0
 5.5- 6.5              6        6     −1      −6       6
 4.5- 5.5              5        3     −2      −6      12
 3.5- 4.5              4        1     −3      −3       9
 2.5- 3.5              3        2     −4      −8      32
 1.5- 2.5              2        1     −5      −5      25
                           N = 27       +30, −28     218

c = (30 − 28)/27 = +0.074; c² = 0.005.
Σfd²/N = 218/27 = 8.074.
8.074 − 0.005 = 8.069.
σ in steps = √8.069 = 2.84.
Step of 1 in table, therefore σ = 2.84.
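
The assumed-mean procedure of Steps 1 to 8 can be sketched in Python. The function name `grouped_mean_and_sd` and the variable names are assumptions for the illustration; the mid-points and frequencies are transcribed from Table 44, lowest step first.

```python
import math


def grouped_mean_and_sd(midpoints, freqs, assumed_mean, step):
    """Mean and standard deviation of a frequency table by the
    assumed-mean (short) method: sigma = s * sqrt(sum(fd^2)/N - c^2)."""
    n = sum(freqs)
    # Step 2: deviations of each step from the assumed mean, in steps.
    d = [(m - assumed_mean) / step for m in midpoints]
    sum_fd = sum(f * di for f, di in zip(freqs, d))        # Step 3
    sum_fd2 = sum(f * di * di for f, di in zip(freqs, d))  # Step 5
    c = sum_fd / n                                         # correction
    mean = assumed_mean + c * step
    sigma = step * math.sqrt(sum_fd2 / n - c * c)          # Steps 6 to 8
    return mean, sigma


# Table 44: mid-points 2 through 15, one-point steps.
T44_MIDPOINTS = list(range(2, 16))
T44_FREQS = [1, 2, 1, 3, 6, 4, 3, 2, 2, 1, 1, 0, 0, 1]

mean, sigma = grouped_mean_and_sd(T44_MIDPOINTS, T44_FREQS,
                                  assumed_mean=7, step=1)
print(round(mean, 3), round(sigma, 2))
```

Any step may be taken as the zero step; the correction c changes, but the mean and sigma returned do not, which is a convenient way of checking a hand computation.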

116. Using the Standard Deviation. 

Assignment of Class Grades. The student or teacher who is
interested in the critical analysis of test scores will find the standard
deviation a very useful and reliable instrument for the purpose. For
example, it offers the basis for a practical plan for turning scores on
objective tests into class marks. The importance of this practice is
so great that the steps involved in the technique are given in detail.
The computations described are based upon the objective test scores
from a class of 45 pupils given in Table 45. The student will do well
to check all these computations for errors.

Step 1. Prepare a suitable frequency table of the test scores, lay
off the deviations from the assumed mean, and find the sum of
the fd's and the arithmetic mean. The mean of this distribution
is 68.55.

Step 2. Compute the standard deviation (σ) of this distribution by
multiplying the fd's by the deviations in steps, summating the fd²'s,
dividing this sum of the fd²'s by the number of cases, subtracting from
this quotient the square of the c used in finding the arithmetic mean,
extracting the square root of this remainder, and multiplying this root
by the size of the step used in the table. The standard deviation of
this distribution found in this manner is 19.40.

† Problems 19, 20, and 21 in Greene's Work-Book in Educational
Measurements (Longmans) furnish excellent supplementary practice in the
computation of the standard deviation.

TABLE 45

Standard Deviation Technique for Assigning Class Grades *

Mid-Points   Class Intervals     f      d      fd     fd²
   110        107.5-112.5        1     10      10     100
   105        102.5-107.5        2      9      18     162
   100         97.5-102.5        2      8      16     128
    95         92.5- 97.5        4      7      28     196
    90         87.5- 92.5        0      6       0       0
    85         82.5- 87.5        2      5      10      50
    80         77.5- 82.5        2      4       8      32
    75         72.5- 77.5        3      3       9      27
    70         67.5- 72.5        3      2       6      12
    65         62.5- 67.5        4      1       4       4
    60         57.5- 62.5        7      0       0       0
    55         52.5- 57.5        7     −1      −7       7
    50         47.5- 52.5        4     −2      −8      16
    45         42.5- 47.5        1     −3      −3       9
    40         37.5- 42.5        1     −4      −4      16
    35         32.5- 37.5        2     −5     −10      50
                            N = 45   Σfd = +109 − 32 = 77   Σfd² = 809

Test scores in descending order, with the grade groups:

A (11.1%)    109, 10[?], 103, 102, 99
B (17.8%)    95, 95, 94, 93, 84, 83, 79, 79
C (35.5%)    77, 76, 76, 71, 71, 69, 64, 64, 64, 64, 62, 60, 60, 60, 59, 59
D (31.1%)    58, 57, 57, 56, 56, 55, 55, 53, 52, 52, 51, 51, 47, 41
Fd (4.5%)    37, 37

$\text{A.M.} = \text{assumed mean} + s\,\frac{\sum fd}{N} = 60 + 5\left(\frac{77}{45}\right) = 60 + 5(1.71) = 60 + 8.55 = 68.55$

$\text{S.D.} = s\sqrt{\frac{\sum fd^2}{N} - c^2} = 5\sqrt{\frac{809}{45} - \left(\frac{77}{45}\right)^2} = 5\sqrt{17.98 - 2.92} = 5\sqrt{15.06} = 5 \times 3.88 = 19.40$

Find score limits:

68.55 + ½(19.40) = 78.25, upper limit of C group.
68.55 + 1½(19.40) = 97.65, upper limit of B group.
68.55 − ½(19.40) = 58.85, upper limit of D group.
68.55 − 1½(19.40) = 39.45, upper limit of Fd group.

A = above 97.65.
B = 78.25 to 97.65.
C = 58.85 to 78.25.
D = 39.45 to 58.85.
Fd = below 39.45.

* For a complete explanation and discussion of the many problems involved in
objectifying teachers' marks see Bangs, C. W., and Greene, H. A., "Teachers'
Marks and the Marking System," University of Iowa Extension Bulletin No. 244,
College of Education Series No. 26, May 15, 1930.

Step 3. Since a distance of two and one-half sigma units above
and below the mean includes almost 99 per cent of all cases in a
distribution, this number of sigma units is laid off above and below the
mean. This naturally results in placing one of the sigma units in the
middle of the distribution in such a way that one-half of the sigma
distance of the middle unit extends above the mean and one-half
below. Accordingly, to the arithmetic mean of 68.55 add one-half
of the standard deviation (one-half of 19.40). This gives a value of
78.25, which becomes the upper limit of the group of scores which will
be assigned grades of C.

Step 4. The upper limit of the group of scores to be assigned B's
is found by adding one and one-half standard deviation units to the
arithmetic mean. Thus, 68.55 + 1.5 (19.40) = 97.65, which is the
upper limit of the B group.

Step 5. The upper limit of the D group is found by subtracting
one-half of a standard deviation unit from the mean. 68.55 − 0.5
(19.40) = 58.85.

Step 6. The upper limit of the Fd group is obtained by subtracting
one and one-half sigma units from the mean of the distribution.
68.55 − 1.5 (19.40) = 39.45.

Step 7. From these values the score limits of this distribution may
be set up. Class grades may then be assigned as indicated to the
scores within the limits specified.


Grades      Score Limits

A           97.65 and above
B           78.25 to 97.65
C           58.85 to 78.25
D           39.45 to 58.85
Fd          below 39.45
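
The grading plan can be sketched in Python as follows. All function and variable names are invented for this illustration; the mid-points and frequencies are taken from the frequency table of Table 45, and the treatment of a score falling exactly on a limit (it goes to the higher grade) is an assumption, since the text does not settle that case.

```python
import math

# Frequency table of Table 45, highest step first.
T45_MIDPOINTS = [110, 105, 100, 95, 90, 85, 80, 75,
                 70, 65, 60, 55, 50, 45, 40, 35]
T45_FREQS = [1, 2, 2, 4, 0, 2, 2, 3, 3, 4, 7, 7, 4, 1, 1, 2]


def mean_and_sd(midpoints, freqs, assumed_mean, step):
    """Arithmetic mean and sigma by the assumed-mean method."""
    n = sum(freqs)
    d = [(m - assumed_mean) / step for m in midpoints]
    c = sum(f * di for f, di in zip(freqs, d)) / n
    mean_sq = sum(f * di * di for f, di in zip(freqs, d)) / n
    return assumed_mean + c * step, step * math.sqrt(mean_sq - c * c)


def grade_limits(mean, sd):
    """Lower score limit of each grade above Fd."""
    return {"A": mean + 1.5 * sd, "B": mean + 0.5 * sd,
            "C": mean - 0.5 * sd, "D": mean - 1.5 * sd}


def assign_grade(score, limits):
    """Grade for one score; ties go to the higher grade (an assumption)."""
    for grade in ("A", "B", "C", "D"):
        if score >= limits[grade]:
            return grade
    return "Fd"


mean, sd = mean_and_sd(T45_MIDPOINTS, T45_FREQS, assumed_mean=60, step=5)
limits = grade_limits(mean, sd)
print(round(mean, 2), round(sd, 2))
for score in (109, 95, 64, 47, 37):
    print(score, assign_grade(score, limits))
```

Because the limits are computed at full precision, they may differ from the rounded hand values in the second decimal place; no score in this class lies close enough to a limit for that to change a grade.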


It is readily apparent that practically no subjective factors are
involved in the assignment of grades by this method. The objective
test scores of the 45 pupils used in the illustration are changed by this
treatment into 5 A's, 8 B's, 16 C's, 14 D's, and 2 Fd's. The score
limits are determined by the standard deviation units and would be
the same no matter who assigned the grades. It should be noted,
however, that these limits hold only for this particular distribution and
must not be assumed to be true for any other test. The teacher should
also remember that this method of grading does not take into account
the absolute level of ability at which a particular class works. The
superior pupil in an average or poor class receives an A by this
method just as readily as does the superior pupil in a very superior
class. This is probably less serious than it sounds, however, for most
class groups large enough to warrant the application of this technique
average out quite well in this respect.

Basis for T-Scores and Other Derived Scores. The standard
deviation also affords the basis for derivation of a number of useful
derived scores in test interpretation. For example, the T-score now
commonly used in commensurating test scores depends upon the
standard deviation for its basic unit. For many years prior to the
development and popularization of the T-score, test scores were
expressed in terms of their position in the total distribution. For
instance, a pupil's score might be a member of a distribution having a
mean of 60 points and standard deviation of 5 points. A score of
50 in the test would be designated by a standard score of −2.0 sigma
units, since it lies exactly two standard deviation units below the
mean of the distribution. This same procedure is used in assigning
T-scores. The formula for the T-score is
$T = \frac{10(x - m)}{\sigma} + 50$, in which m is the mean of the
distribution, x the score, and σ the standard deviation of the
distribution. The difference between the mean and any test score,
expressed in sigma units, is multiplied by 10 in order to
remove all decimal points. The 50 points are added in order that
there may be no negative scores. The T-score is a very convenient
method of interpreting test scores. A T-score of 50 means that the
individual pupil's score was right at the mean, and in a normal
distribution T-scores of about 43 and 57 fall at the lower and upper
quartiles. This fact makes it easy to attach meaning to the test scores.
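
The conversion is short enough to show as a Python sketch. The function name `t_score` is an illustrative choice, and the formula coded is the one described in the paragraph above: the deviation of the score from the mean, in sigma units, multiplied by 10 and increased by 50.

```python
def t_score(score, mean, sd):
    """T-score: 10 times the deviation in sigma units, plus 50."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return 10 * (score - mean) / sd + 50


# The example from the text: mean 60, standard deviation 5.
print(t_score(50, 60, 5))  # two sigma units below the mean
print(t_score(60, 60, 5))  # a score exactly at the mean
```

A score two sigma units below the mean thus becomes 20 points below 50 on the T scale, so the sign and the decimal point of the standard score both disappear.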

Scaling of Test Items. The standard deviation, along with certain
other measures of variability, represents a convenient unit in which
to evaluate the difficulty of test items. When used under these
conditions the standard deviation of a theoretically normal curve of
the specified ability is used as the unit in laying off differences in
difficulty along a linear scale. As a first step in the procedure, the
percentage of pupils failing on each item or exercise must be secured.
By means of tables based upon the normal curve these percentages
of failure are changed into standard deviation units which express the
position of the exercises with respect to the mean ability of an
infinite and normal population. Exercises which are answered
successfully by 50 per cent of the class are assigned a position at the
mean. Exercises missed by 55 or 60 per cent of the class are given
sigma values above the mean, etc. A significant feature of this
procedure, however, is the fact that a difference in difficulty of 5 per
cent near the mean results in a relatively small sigma difference,
while a 5 per cent difference near the extremes of the distribution
makes a relatively large sigma difference. This is in conformity with
the fact that because of the height of the curve near the mean a
smaller distance along the linear scale on the base line is required to
add a given area of the curve. Thus, the difference in the sigma values
assigned to two test items having percentages of failure of 55 and 60
is 0.13 standard deviation unit (2.74 - 2.61),¹⁵ while the difference in
apparent difficulty of two items failed by 90 and 95 per cent of an
experimental group is 0.34 standard deviation unit (4.09 - 3.75). The
net result of this method of item evaluation is to magnify somewhat the
simplicity of the very easy items and the difficulty of the very hard
ones.
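The conversion from percentage of failure to sigma units can be sketched with the normal curve routines in the Python standard library. This is only an illustration: the function name is hypothetical, the values are measured from the mean rather than from the zero point of the printed table cited in the text, and a modern computation may differ slightly from the tabled differences quoted above.

```python
from statistics import NormalDist


def sigma_value(percent_failing):
    """Position of an item on the base line of the unit normal curve,
    measured from the mean (50 per cent failing gives zero)."""
    return NormalDist().inv_cdf(percent_failing / 100)


# Equal 5 per cent differences give unequal sigma differences.
near_mean = sigma_value(60) - sigma_value(55)
near_extreme = sigma_value(95) - sigma_value(90)
print(round(near_mean, 2), round(near_extreme, 2))
```

The difference near the mean comes out close to the 0.13 quoted in the text, and the difference near the extreme is several times larger, which is the point of the paragraph.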

Sigma units are also utilized in the construction of scales for the 
estimation of the merit or quality of certain classroom products. The 
use of these units in the derivation of such scales is discussed in detail 
in Chapter XII of this book. 

IV. MEASURES OF RELATIONSHIP

The critical analysis and interpretation of educational test results
often make it necessary for the teacher and research student to secure
more exacting descriptions of the situation than are afforded by the
simple measures of central tendency and variability. For example,
the matter of the selection of the test itself is one which cannot be
determined wholly on the basis of the arithmetic means and the
standard deviations. The most useful information for this purpose
is found by determining the relationship which exists between the
ability to be measured and the tests or measures under consideration.
In the construction and use of informal objective examinations there
are occasions when it is necessary to discover exactly how accurately
the examination measures, and how much this accuracy would be
improved by increasing the length of the examination. This type of
analysis also requires the use of the method of correlation, the method
which permits the determination of relationships among different
measures of the same individuals.

117. The Correlation Coefficient.

In the statistical expression of relationships, as in the other
measures, it is desirable that this relation between two series of
variables be expressed in a single objective or mathematical value. A
number of different ways have been proposed for the derivation of
these expressions of relationship, but no one of them has produced a
term which is both objective and easily interpreted. Methods of
computing relationships in terms of the correspondence between rank
positions of scores, and in terms of the percentage of the scores
falling within a specified unit of variability of each other, have been
proposed. In general these procedures lack sufficient exactness to
warrant their common use in the analysis of test results. The student
who is interested in these different methods of revealing relationships
will find them discussed in certain of the treatments on statistical
methods listed in the references at the end of this chapter. In this
discussion, attention is given exclusively to the Pearson
product-moment method, which is not only the basic method but also the
one most commonly used in educational statistics.

¹⁵ See Table 5 on page 392 of Rugg's Statistical Methods Applied to
Education (Houghton Mifflin), or any similar table of sigma values.

The index expressing the degree of relationship between two series
of measures is called a coefficient of correlation. The coefficient
resulting from the application of the Pearson product-moment method
is designated as r. The possible values of r range from perfect
positive relationships (+1.0) through all the possible decimal values
through zero to -1.0, indicating a perfect negative relationship. An
r of zero is taken to mean that no relationship exists between the
measures or that it is entirely due to chance. Positive relationships
may be expected between such factors as barometric readings and
atmospheric pressure, or between each of such factors as native
capacity, initiative, effort, concentration, interest, and school
accomplishment in a given field. Negative relationships are usually
found to exist within a given school grade between the chronological
age of the pupils and their achievement scores on a reliable
achievement test. Many low or zero relationships are found in
educational data, but the best illustration of this type of correlation
is one in which pure chance operates. If two packs of numbered cards
are shuffled carefully and cards are drawn from each pack and paired,
the resulting relationship is due to pure chance, and the coefficient
of correlation (r) approaches zero. If the same packs of cards are
rearranged so that the numbers appear in ascending order in each pack
and cards are drawn off the top of each and paired as before, the r
obtained should be positive and very high. If one of the packs is
inverted and the cards are drawn as before, the result should be a
high negative correlation.
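The card illustration can be imitated in a few lines. The sketch below is offered only as a demonstration: the pack size of 1000 cards and the fixed random seed are assumptions made for the example, and r is computed directly from deviations about the true means rather than from a correlation table.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / sqrt(sum_xx * sum_yy)


pack_a = list(range(1, 1001))  # assumed pack of 1000 numbered cards
pack_b = list(range(1, 1001))

# Both packs in ascending order: perfect positive relationship.
ordered_r = pearson_r(pack_a, pack_b)

# One pack inverted: perfect negative relationship.
inverted_r = pearson_r(pack_a, pack_b[::-1])

# Both packs shuffled: the pairing is due to chance alone.
rng = random.Random(0)  # fixed seed so the run is repeatable
shuffled_a = pack_a[:]
shuffled_b = pack_b[:]
rng.shuffle(shuffled_a)
rng.shuffle(shuffled_b)
chance_r = pearson_r(shuffled_a, shuffled_b)

print(ordered_r, inverted_r, chance_r)
```

With packs this large the chance coefficient should fall very close to zero, while the ordered and inverted packs give the two extreme values of r.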

The Pearson product-moment method of computing correlations,
while involving a large number of rather complicated calculations,
really calls for very few skills that the student has not encountered
previously in this work. This coefficient of correlation (r) is a single
numerical value which expresses the tendency of corresponding pairs
of measures of two fields to deviate similarly from their respective
means. Modern methods of work permit the computation of this
coefficient from data arranged in frequency tables of the double-entry
type.

The double-entry or correlation table permits the simultaneous
tabulation of two measures of the same individuals. The class
intervals are set up exactly the same as in preparing a simple
frequency table. In fact, it is merely an overlapping table with two
sets of class intervals, one reading upward along the left-hand margin
and one reading to the right along the top. Such a double-entry
tabulation is shown in Table 46, which also serves to illustrate the
specific steps involved in computing the correlation coefficient.

The data in Table 46 represent the very real problem of determining
the reliability of an experimental test by finding the correlation
of the scores made on one-half of the test with the scores made on
the other half. Let us assume that Pupil A made scores of 25 and
29 on the two halves of the test. The position of the score of 25 on
the first half is found in the scale for that part of the test. This is
in the step with a mid-point of 24. We then move across the table
horizontally until the proper space is found for the score of 29 on the
second half of the test. This is in the step with a mid-point of 30.
A single tabulation mark in that space identifies both scores and at the
same time indicates something of their relation to each other. In
such a table a tendency for the frequencies to group themselves along
the diagonal of the table itself is an indication of a positive
relationship. Scores which deviate from the diagonal reduce the
relationships. Figure 22 indicates something of the types of
relationship which may be expected from characteristic groupings of
data. This interpretation can be made only when the scales of the
tables read upward and to the right.

[Fig. 22. Types of Relationship. Panels: r high positive; r high and
negative; r zero or indifferent.]
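The tabulation of a pair of scores into a double-entry table can be sketched as follows. The step of 3 with mid-points divisible by 3 is an assumption inferred from the example of Pupil A (25 falls in the step with mid-point 24, and 29 in the step with mid-point 30); the second and third pupils are hypothetical and are included only to show the tallying.

```python
from collections import Counter

STEP = 3  # assumed step size; mid-points are multiples of the step


def midpoint(score, step=STEP):
    """Mid-point of the step containing an integer score.
    With a step of 3 the quotient never ends in exactly .5,
    so ordinary rounding is safe here."""
    return step * round(score / step)


def double_entry_table(pairs, step=STEP):
    """Tally each pair of scores in the cell named by its two mid-points."""
    return Counter((midpoint(a, step), midpoint(b, step)) for a, b in pairs)


# Pupil A from the text, followed by two hypothetical pupils.
pairs = [(25, 29), (24, 30), (31, 27)]
table = double_entry_table(pairs)
print(table)
```

Each key of the resulting tally is one cell of the correlation table, and its count is the number of tabulation marks in that cell.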

118. Computation of Pearson Coefficient of Correlation.

The Pearson product-moment coefficient of correlation is found by
the solution of the following formula:

$$r = \frac{\dfrac{\sum xy}{N} - c_x c_y}{\sigma_x \sigma_y}$$

in which $N$ is the number of cases in the distribution, $\sigma_x$ the
standard deviation of the distribution on the x-axis, $\sigma_y$ the
standard deviation on the y-axis, $c_x$ the correction on the x-axis,
$c_y$ the correction on the y-axis, and $\sum xy$ the summation of the
products of the deviations of each measure from the means of the
distributions.

The following steps are involved in the solution of the formula
as applied to the data given in Table 46:

Step 1. The data are already tabulated in a double-entry or
correlation table, so the first step in the work is to total the
frequencies on each axis and cross-check the totals.

Step 2. Assume suitable means for each axis of the table and lay
off steps above and below the zero step. Compute the corrections
on the x-axis and on the y-axis exactly as in finding the arithmetic
mean and the standard deviation. In this work the $c_x$ is +0.307 and
the $c_y$ is -0.109.

Step 3. Multiply the fd column by the d column and summate
the resulting fd²'s separately for each distribution. Complete the
calculation of the standard deviation for each distribution. Leave the
resulting sigmas in terms of steps. This will save an extra
multiplication and unnecessary division later in the work.

Step 4. Find the sum of the xy products. This constitutes the
only absolutely new step in the process up to this point. In the
product-moment method each score or frequency is weighted in
proportion to the distance it lies away from the means of the
distributions. Thus, in the example, the single individual score which
lies in the extreme upper right-hand section of the table deviates a
distance of +6 steps from the mean of the y-axis and +7 steps from the
mean of the x-axis. The combined moment of this single case is found
by multiplying it by the product of 6 and 7 (1 × +6 × +7 = 42).
This case therefore has a moment of 42. The three cases at the
intersection of the steps with mid-points of 33 have a combined product
of 90 (+5 × +6 × 3 = 90). All cases in the upper right-hand and
lower left-hand quadrants have a positive moment, owing to the
algebraic law of signs in multiplication. All frequencies lying in the
upper left-hand or the lower right-hand quadrants of the table have
negative product-moments, since the product of a plus step-deviation
by a minus step-deviation results in a negative sign. In this work the
xy products are summated algebraically in a column along the
right-hand side of the table. The total of the xy products is 1780.

Step 5. The numerator of the fractional equation representing the
correlation coefficient is $\frac{\sum xy}{N} - c_x c_y$. This quantity
is found by dividing 1780 by the number of cases in the distribution
and subtracting (algebraically) the product of the corrections for the
two distributions. The result of dividing 1780 by 414 is 4.2995. The
$c_x c_y$ product is -0.0335. Since this $c_x c_y$ product is negative
(owing to the negative sign of one of the corrections), the net result
is the addition of these two quantities. The numerator of the
fractional part of the formula now becomes 4.33.

Step 6. The denominator of the fractional part of the formula
comprises the product of the two standard deviations. The product
for this correlation table is 4.995.

Step 7. The correlation coefficient (r) is the ratio of the two values
found in steps 5 and 6. The r of this distribution is therefore +.867,
which means that the relationship is positive and significantly high.
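Steps 5 to 7 can be verified from the summary values quoted above. The sketch below is illustrative; the function names are hypothetical. The first function uses the totals given in the text, and the second applies the same assumed-mean formula to raw paired scores, which is an extension not worked in the text.

```python
from math import sqrt


def r_from_summary(sum_xy, n, c_x, c_y, sigma_product):
    """Steps 5 to 7: (sum of xy / N - c_x * c_y) / (sigma_x * sigma_y)."""
    return (sum_xy / n - c_x * c_y) / sigma_product


def r_assumed_mean(xs, ys, assumed_x, assumed_y):
    """The same formula applied to paired scores, with deviations
    taken from assumed means instead of the true means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    c_x = sum(dx) / n  # correction on the x-axis
    c_y = sum(dy) / n  # correction on the y-axis
    sigma_x = sqrt(sum(d * d for d in dx) / n - c_x ** 2)
    sigma_y = sqrt(sum(d * d for d in dy) / n - c_y ** 2)
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return (sum_xy / n - c_x * c_y) / (sigma_x * sigma_y)


# Values quoted in the text for Table 46.
r = r_from_summary(sum_xy=1780, n=414, c_x=0.307, c_y=-0.109,
                   sigma_product=4.995)
print(round(r, 3))
```

Because the corrections remove the effect of the assumed means, the second function gives the same r whatever assumed means are chosen.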

119. Meaning of the Correlation Coefficient. 

It was suggested earlier in this discussion that the interpretation
of the correlation coefficient is probably much more difficult than its
computation. There are a few devices which are helpful to the
inexperienced worker, but, in general, assurance in the interpretation
of these measures comes only through experience. The suggestions given
in this section may be useful during the period when this experience
is being gained.

One of the important outcomes of the use of correlation methods
is that within certain limits it makes possible the estimating of
unknown values from known values. The accuracy of this estimate,
however, depends directly upon the correlation between the factors
measured. For example, if it is known from previous experience that
there is a high positive relationship between achievement in a specific
manual arts subject and the students' response to certain manual
dexterity tests, the probable achievement of a group of prospective
students in this manual arts course may be determined within limits
by securing their response to the manual dexterity test. A correlation
coefficient of +1.0 for the two factors would mean that an estimate
of accomplishment based on the one factor would be approximately
100 per cent correct. As the amount of the correlation decreases the
accuracy of the forecast declines, but not in a direct manner. A
correlation of +1.0 may mean 100 per cent accuracy in the estimate
based on the relationship, but a correlation of +.50 does not mean at
all that the estimate based on it will be 50 per cent correct. A glance
at the accompanying table will demonstrate this interesting fact about
the correlation coefficient.

The percentages of forecasting accuracy for different values of r
given in Table 47 are obtained by applying Kelley's proposed formula
for the coefficient of alienation ($k = \sqrt{1 - r^2}$) and then
deducting the resulting values expressed as percentages from 100 per
cent. If estimates of one variable are to be made from measurements of
another related variable, this table will prove to be a useful
safeguard.

TABLE 47

Percentage of Forecasting Accuracy for Specific Values of r

Coefficient of      Percentage of
Correlation         Forecasting Efficiency
   1.00                  100
    .99                   86
    .98                   80
    .95                   69
    .90                   56
    .866                  50
    .80                   40
    .75                   34
    .70                   29
    .65                   24
    .60                   20
    .50                   13
    .40                    8
    .30                    5
    .20                    2
    .10                    ½

The following illustrations and practical interpretations of typical
correlation coefficients representative of the sort obtained from
educational data have been gleaned from a number of sources. They are
offered here for whatever guidance they may give to the student or
teacher interested in this type of test analysis.


r value   Educational Situation and Interpretation

+.96      Situation: Relation of two forms of a long, analytical
          reading test for high-school students.
          Interpretation: Evidence of unusually high reliability of
          measurement; treat scores with confidence.

+.80      Situation: Repetition of same form of a group test of
          mental ability after a lapse of one semester.
          Interpretation: Evidence of a marked relationship;
          considerable prognostic power even after lapse of a long
          interval.

+.65      Situation: A composite of three separate estimates by same
          teacher of the ability of a class of 35 students to respond
          to an objective test in industrial arts.
          Interpretation: Evidence of some relationship but of limited
          use for prognostic purposes.

+.50      Situation: Scores on a good group intelligence test and the
          class grades of a class in industrial arts.
          Interpretation: A very slight relationship of no practical
          value for forecasting purposes (only 13 per cent effective).

-.24      Situation: Chronological ages of pupils in a given grade and
          achievement scores on an objective test.
          Interpretation: Negative relationship of an indifferent
          sort. Merely shows a very slight tendency for younger pupils
          in a grade to achieve at a higher level than the average.
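The percentages in Table 47, including the 13 per cent quoted for an r of .50, follow directly from the coefficient of alienation. The short sketch below shows the arithmetic; the function names are illustrative only.

```python
from math import sqrt


def coefficient_of_alienation(r):
    """Kelley's k, the square root of (1 - r squared)."""
    return sqrt(1 - r * r)


def forecasting_efficiency(r):
    """Percentage of forecasting efficiency: 100 per cent less k,
    with k expressed as a percentage."""
    return 100 * (1 - coefficient_of_alienation(r))


for r in (1.00, 0.866, 0.80, 0.50, 0.10):
    print(f"r = {r:.3f}   efficiency = {forecasting_efficiency(r):.1f} per cent")
```

The loop makes the non-linear decline plain: halving r from 1.00 to .50 reduces the efficiency not to 50 per cent but to about 13 per cent.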
120. Practical Uses of Correlation Method.

The teacher of industrial arts or the student of measurement in
this field will unquestionably find the greatest use for correlation
techniques in connection with the analysis of objective tests. The
validity of a test may be expressed statistically in terms of the
correlation of the instrument with some other criterion, such as
another measure of known validity. The determination of the reliability
of a test is almost entirely a matter of correlation method. Mastery of
these uses of the correlation techniques will make the teacher a better
critic of commercial standardized materials as well as a more
independent and efficient builder and critic of teacher-made tests for
classroom use. Such mastery can come only from extensive practice on
problems calling for the use of these skills.¹ Students who are
interested in the theory of correlation or in the use of correlation
methods in more critical research with tests are referred to the many
excellent textbooks on statistical methods now available.


¹ See Problems 22, 23, 24, 25, 39, and 40 in Greene's Work-Book in
Educational Measurements (Longmans).


V. ASSIGNMENT OF RELATIVE AND ABSOLUTE RANKS

121. Relative Ranks.

Achievement as expressed in test scores can have meaning only
when it is possible to consider it in relation to some other, known
level of achievement. In many cases, as when informal objective
examinations are used, no definite standards or norms of achievement
are available. Some simple method of comparing the accomplishment
of each pupil in relation to the other individuals in the class is
essential. The process of assigning relative ranks to pupils' scores in
terms of their size is one way of doing this. This is accomplished by
assigning to the individual making the highest score the first position
in the class, the pupil making the second highest score the second
position, etc. The assignment of such relative ranks is quite simple
where the individual pupils make different scores, or where no tied
scores appear. The illustration given in Table 48 shows how all such
tied scores are treated in the assignment of relative ranks.

TABLE 48

Pupil    Score    Rank
  A       15       1
  B       12       2
  C       11       3
  D       10       4.5
  E       10       4.5
  F        9       7
  G        9       7
  H        9       7
  I        8       9
  J        7      10

Pupil A, with a score of 15 points, is assigned first position. Pupil
B, with a score of 12, is given second position. Pupil C, with a score
of 11, is given third place. Pupils D and E, both with scores of 10,
would normally be assigned fourth or fifth places, but since it is
impossible to assign fourth or fifth place to one rather than the
other, tied or average rank is assigned to each. In this instance 4.5
position is given to each of the pupils making a score of 10 points.
Pupils F, G, and H are also tied with scores of 9 points each, but
since they would regularly be assigned sixth, seventh, and eighth
positions they are each given the average of these positions, or
seventh place. The practice, therefore, in all cases of tied scores is
to assign average rank to the tied scores. When the number of tied
scores is even, the ranks assigned will lie mid-way between the ranks
which would ordinarily be assigned to two middle scores. When the
number of tied scores is odd, the position assigned to all is the
position which would normally fall to the middle score. In general, the
position assigned to the pupil with the lowest score will agree with
the number of cases in the series except when the last scores are tied.
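The rule for tied scores can be stated compactly in code. The sketch below is one possible way of carrying it out, with a hypothetical function name; the scores are those of Table 48.

```python
def average_ranks(scores):
    """Relative ranks, highest score first; tied scores share the
    average of the positions they would otherwise occupy."""
    ordered = sorted(scores, reverse=True)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # The tied block occupies positions i + 1 through j.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[score] for score in scores]


# Scores of Pupils A through J in Table 48.
scores = [15, 12, 11, 10, 10, 9, 9, 9, 8, 7]
ranks = average_ranks(scores)
print(ranks)
```

Averaging the first and last positions of a tied block gives the mid-way rank for an even number of ties and the middle position for an odd number, as described above.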

122. Absolute Ranks. 

The practice of assigning relative positions to pupils on the basis of
their test scores, though aiding in the interpretation in some ways,
actually covers up something of the actual situation. As a matter
of fact, the assignment of relative ranks covers up the true differences
in the size of scores. In Table 48 the difference of three score points
between Pupils A and B is indicated by only a single position in rank,
the same as is given to the difference of one point for Pupils B and C.
Thus, relative ranks reveal that a pupil is above or below another
in achievement, but they do not indicate in any way the magnitude of
that difference. Relative ranks also take no account of the actual
achievement level at which the accomplishment takes place. A pupil
having a rank of 18 in a class of 20 would be considered as having
a low ranking in his group. However, if he were found to rank
eighteenth among 400 similar individuals his position would indicate
a significantly different type of achievement. Percentile ranks, as one
form of absolute ranking, take this factor into account by reducing all
ranks to a basis of 100 units. A percentile rank of 100 means that
the individual making the specified score achieves at a level high
enough to exceed 100 per cent of a similar group without regard to
the number in the group. In a similar way, a percentile score of 75
means that the individual made a score such that it exceeds that of
75 per cent of the individuals of his group without respect to number.
Percentile scores are easily computed from frequency tables and
are very useful in comparing the achievement of pupils taking an
informal or non-standardized test. Percentile scores are also used very
widely in the interpretation of standard test scores at the
secondary-school and college level. The student will recognize the
seventy-fifth percentile as a measure with which he has already had
some contact. This percentile is the same as the third or upper
quartile ($Q_3$). It is found by exactly the same methods as are used
in finding the median except that 75 per cent of the cases in the
distribution are counted out below the point on the scale assigned as
the seventy-fifth percentile. The same general methods are applied in
the determination of the twenty-fifth, fiftieth, or any other
designated percentile. The tenth, twentieth, thirtieth, and fortieth
percentiles, etc., are known as the deciles. These are very often used
in test interpretation.
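The counting-out procedure just described can be sketched as a small routine. The function name is hypothetical. The frequencies are those read from Table 49, listed from the lowest class interval upward; the frequency of 8 for the interval 112.5-117.5 is an assumption, chosen so that the frequencies total 71 cases.

```python
def percentile_point(lower_limits, frequencies, width, percent):
    """Point on the scale below which the given per cent of cases lie,
    found by counting up from the bottom and interpolating within
    the class interval that contains the required case."""
    n = sum(frequencies)
    target = percent / 100 * n
    below = 0
    for lower, f in zip(lower_limits, frequencies):
        if f > 0 and below + f >= target:
            return lower + (target - below) / f * width
        below += f
    raise ValueError("percent must lie between 0 and 100")


# Class intervals 57.5-62.5 up to 162.5-167.5, width 5.
lower_limits = [57.5 + 5 * i for i in range(22)]
frequencies = [1, 0, 0, 1, 0, 2, 1, 4, 8, 8, 11,
               8, 3, 6, 2, 4, 5, 2, 3, 0, 1, 1]  # the 8 in row two is assumed

median = percentile_point(lower_limits, frequencies, 5, 50)
q1 = percentile_point(lower_limits, frequencies, 5, 25)
q3 = percentile_point(lower_limits, frequencies, 5, 75)
print(round(q1), round(median), round(q3))
```

Under these assumptions the first quartile, median, and third quartile round to 103, 112, and 128, in agreement with the test scores listed for those percentiles in Table 49.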

The computation of the commonly used percentile scores is
illustrated in Table 49. Since all the processes involved here have
been used in earlier work, the computation is presented without
comment.


TABLE 49

Computation of Percentile Scores

Class Interval      f
162.5-167.5         1
157.5-162.5         1
152.5-157.5         0
147.5-152.5         3
142.5-147.5         2
137.5-142.5         5
132.5-137.5         4
127.5-132.5         2
122.5-127.5         6
117.5-122.5         3
112.5-117.5         8
107.5-112.5        11
102.5-107.5         8
 97.5-102.5         8
 92.5- 97.5         4
 87.5- 92.5         1
 82.5- 87.5         2
 77.5- 82.5         0
 72.5- 77.5         1
 67.5- 72.5         0
 62.5- 67.5         0
 57.5- 62.5         1
               N = 71

Percentile                                                        Test
Score       Interpretation                                        Score
  100       Score equaled or excelled by no student.               167
   90       Score equaled or excelled by 10 per cent of students.  142
   80       Score equaled or excelled by 20 per cent of students.  135
   75       Third quartile: score equaled or excelled by 25 per    128
            cent of students.
   70       Score equaled or excelled by 30 per cent of students.  124
   60       Score equaled or excelled by 40 per cent of students.  116
   50       Median: score equaled or excelled by 50 per cent of    112
            students.
   40       Score equaled or excelled by 60 per cent of students.  109
   30       Score equaled or excelled by 70 per cent of students.  105
   25       First quartile: score equaled or excelled by 75 per    103
            cent of students.
   20       Score equaled or excelled by 80 per cent of students.  101
   10       Score equaled or excelled by 90 per cent of students.   95
    0       Score equaled or excelled by practically all            58
            students.

The interpretation of percentile scores frequently gives some
trouble to the worker inexperienced in their use. Fig. 23 is a graphic
presentation of the percentile scores given in Table 49. This figure
shows the characteristic curve (ogive) resulting from the use of
percentile scores. The heavy solid line in the figure represents the
results of an arbitrary smoothing of these percentile scores. This
smoothing process is frequently used when percentile scores are based
on fairly large populations and are set up as tentative norms for the
interpretation of the tests.

VI. SUMMARY 

This chapter presents a non-technical discussion of a few of the
common statistical tools which teachers of industrial arts will find
useful in the analysis and interpretation of educational test results.
Discussions of four of the six major statistical techniques outlined in
the introductory paragraph of this chapter are presented. The
fundamental principles of the grouping and tabulating of test scores
are stated and illustrated. The need for measures of dispersion is
shown. The quartile deviation and the standard deviation are explained
in some detail, and practical applications of these measures to
problems of test analysis are made. The general meaning and the
methods of correlation are given, along with a few definite hints
concerning the interpretation of correlation coefficients. The
practical uses and meanings of the ranking of test scores on both the
relative and the absolute basis are discussed. The two remaining
problems, dealing with the derivation and interpretation of test norms
and standards, and the use of simple graphic methods of presenting the
results of statistical analysis, are reserved for treatment in the
following chapter.

There has been no attempt to make this chapter a complete
discussion of all the interesting or even useful statistical
techniques. To do this would require a volume in itself. As a matter of
fact, the brevity of the treatment makes it impossible to present an
adequate number of examples and illustrations to give the inexperienced
worker sufficient experience with statistical problems. Real mastery of
these skills can come only through repeated and continuous use. The
student who is interested in achieving real skill and understanding in
this field will wish to make extensive use of the selected references
on page 224.

EXERCISES IN SUMMARIZING RESULTS OF TESTING

Tabulating Test Scores

Problem 1

a. Arrange or rank these scores from an objective examination in
   woodworking in descending order.

   95, 99, 40, 44, 68, 84, 51, 60, 91, 58, 00, 60, 72, 87, 77, 76, 65, 00, 70, 89, 77,
   SO, 78, 78, 02, 61, 04, 64, 54, 57, 58, 63, 93, 90, 70, 59, 02, 69, 70, 85, 72, 73,
   83, 71

b. What is the largest score made on this test?

c. What is the smallest score made on this test?

d. What is the range of the scores?

e. If a frequency table with a step of 3 is made, how many steps will
   be required?

f. What will be the limits of the step required for the largest score?

g. What will be the mid-point of this step?

h. Make a frequency table of these scores using a 3-point step and
   mid-points divisible by the size of the step. Do your work on the
   left half of a sheet of paper and preserve it for use in later
   problem work.

i. If your work is right, the frequencies reading from the top will be
   as follows:

   1, 1, 2, 2, 1, 3, 1, 4, 2, 3, 5, 4, 0, 2, 3, 2, 0, 0, 1, 0, 1

   N = 44

Computing the Arithmetic Mean

Problem 2

Compute the arithmetic mean from the frequency table prepared in
Problem 1. (Answer = 70.4 from frequency table with step of 3.)

Computing the Mid-Measure and the Median

Problem 3

Compute the median from the frequency table in Problem 1.
(Answer = 69.7.)

Problem 4

Find the mid-measure for the scores given in Part a of Problem 1.
(Answer = 70.0.)

Computing Measures of Variability

Problem 5

Find the quartile deviation for the scores tabulated in Problem 1.
(Answer = 8.5.)

Problem 6

Find the standard deviation of the scores in the table prepared for
Problem 1. (Answer = 13.3.)

Computing Measures of Relationship

Problem 7

The following paired scores were obtained by giving the same form of an
objective examination two times to the same pupils:

Pupil   First   Second        Pupil   First   Second
  A      61      67             N      52      55
  B      56      60             O      44      49
  C      73      70             P      40      41
  D      07      70             Q      33      33
  E      53      49             R      58      59
  F      48      52             S      70      77
  G      43      44             T      70      75
  H      35      31             U      03      63
  I      23      25             V      50      SO
  J      57      56             W      41      46
  K      78      81             X      30      37
  L      71      73             Y      34      31
  M      65      67             Z      25      27

a. Prepare a correlation table of these 26 pairs of scores. Use a
   3-point step on both axes. Compute the coefficient of correlation as
   a basis for expressing the reliability of the objective examination.
   (Answer = +.959.)

Computing Percentile Ranks

Problem 8

Use the frequency table for the first test scores tabulated in
Problem 7, and compute the percentile scores for each of the deciles as
shown in Table 49. Check your own work for accuracy.


CHAPTER XV


INTERPRETING THE RESULTS OF TESTING

I. THE RESULTS OF TESTING

123. The Meaning of a Test Score.

It is important in this chapter on the interpretation of the results
of testing to define clearly what is meant by a test score. In order
to accomplish this, two or three new concepts may require explanation.
In the first place, a test score is a numerical expression of
performance on the part of an individual. Sometimes the test score is
merely the number of exercises responded to correctly. Sometimes it is
an arbitrarily defined scale value. But whatever its form, its function
is to reveal in a quantitative way the performance of an individual
as he responds to stimuli given under certain conditions. This leads
to the second concept involved in the meaning of a score. The test
score is an evidence of performance. Performance, the response of the
individual to the test situation, is the expression of ability operating
under certain conditions. The pupil may make a poor score because
he does not have the ability to do better; he may not know the facts.
On the other hand, he may make a low score because of certain
physical conditions: illness; discomfort; poor hearing, sight, or
illumination; a broken pencil; a dislike for the subject, the teacher,
or examiner; a failure to give attention to and to comprehend the
directions, etc. Any one of these or a dozen other factors may affect
the score. Accordingly, there is the possibility and even likelihood of
a serious error in the assumption that a test score is a direct
evidence of ability. The conditions under which the performance takes
place must be known before it is safe to infer ability from
performance.

Ability, as an abstract concept, may be defined as the power to do.
Power to do, to respond to stimuli and to situations, is the product of
training and experience. This suggests that, unless training and native
capacity factors are known, inferences as to abilities may be
misleading. This point becomes particularly serious in the
interpretation of mental-test results, for it is common practice for
users of mental tests to infer innate capacity (mental ability) from
performance scores. The real seriousness of this type of uncritical
inference may be seen by comparing the interpretations of an
achievement-test score and a mental-test score. Both are basically
expressions of performance. Equal abilities may be inferred from equal
scores from both types of tests if and when all the conditions under
which they are given are definitely under control. Although it is
difficult to make sure that all physical and physiological factors are
adequately controlled in a testing situation, it is possible to
regulate most of the mechanical conditions within reasonable limits.

The significant point to note here, however, is the fact that users 
of achievement tests stop with an inference of equality of ability from
equal performance scores, but users of mental tests are obliged to take
a further inference. In the interpretation of mental-test results it is
common practice to infer equal native capacity from apparent evidences
of equal abilities. The fallacies in this argument and the
dangers of this step must be readily apparent. Equal capacities may
be inferred from performance scores only when there is direct and
positive evidence of two things: first, that the conditions under which 
the testing took place were identical and equally well controlled; sec¬ 
ond, that the training opportunities of the individuals compared have 
been equal. The mechanics of testing now make it fairly easy to con¬ 
trol testing conditions. The second factor represents a real stumbling- 
block in the way of an accurate and sane interpretation of the mental- 
test results. The naive manner in which some makers and many users
of mental tests assume equality of learning opportunity, and hence 
equal capacity from equal performance scores on mental tests, is one 
of the things which has made teachers and students skeptical of their 
value. 

It is possible that the foregoing discussion of the meaning of a test 
score may appear to indicate that it is impossible to give meaning to
any kind of a test score. Such is not the intention, even though the 
purpose here is to emphasize the need for a conservative attitude in 
test-score interpretation. In the long run, the more that is known 
about the variables underlying test scores, the more critical must the 
user become. The greatest damage that has been done to the field 
of educational measurements in the past has come as a direct result 
of carelessness and ignorance on the part of users of tests, and their 
tendency to draw unwarranted conclusions from the results. The industrial
arts teacher should be able critically to select suitable tests
and scales for classroom and shop use, control the mechanical conditions
of their administration, and draw sane and defensible conclusions
and inferences from the results.
124. Giving Meaning to Informal Test Scores.

The user of industrial education tests in the classroom is confronted
with two types of test data for interpretation. The first type,
and undoubtedly the more common of the two, deals with the results
of informal, teacher-made tests. The results from these home-made
tests in turn are of two types: the subjective scores assigned by teachers
to pupils’ responses to essay-type tests, and the performance scores
resulting from informal objective examinations. Although something
can be done to improve the interpretation of the relatively unreliable
marks assigned to the discussion-type exercise, the performance scores
resulting from reasonably long and reliable objective examinations are
much more important measures of achievement, and as such deserve
complete and accurate interpretation. The second type of educational
test data requiring interpretation arises, of course, from the results of
using standard tests. Since one of the major functions of the standardization
of a test is the establishment of meaning for the test scores,
many more types of interpretation are possible for data of this type.
Purely for convenience in the organization of this discussion, problems
of the interpretation of standard test scores are considered first.


II. NORMS AND STANDARDS 

125. The Meaning of Standardization. 

Early in the history of objective testing in the classroom practically
all that was required for development of a so-called standardized
test was to give a few reasonably suitable test exercises to a
hundred or more pupils in different school systems. These results
were then compiled and submitted as norms. In fact, for many years
almost the only real difference between a standardized test and a reasonably
good informal objective test was the fact that the former
had been tried out in a larger number of different classes. Test 
standardization as it is now interpreted means much more than the 
mere derivation of norms, although the existence of norms is still one 
of the chief distinctive features of the standard tests. There has been 
much improvement in both the informal test and the more formal 
standardized test. 

In terms of present-day test-construction practices the standard¬ 
ization of a test involves a long period of experimentation with a large 
body of subject-matter exercises. After the subject-matter field to be 
tested has been decided upon, there is the very difficult problem of 
selecting the more important areas of this field to be sampled. Many 
times experimental evidence must be secured before it is possible to 
decide upon the best type of test exercise to use. Even then, many of
the exercises prepared in preliminary form for the test are found to be
badly stated or to be totally unsuited for the type of test to be constructed.
Usually from four to six times as many exercises must be
prepared in the preliminary work with a test as will appear in the 
test itself when in final and standardized form. Special care must be 
taken to see that a suitable range of difficulty is provided in the items, 
and that multiple items covering certain of the more important skills 
are prepared in parallel form so that these items may be adequately 
sampled in the several forms of the test which must be prepared. 
After the exercises themselves have been written in preliminary form
they must be tried out under experimental conditions in typical classrooms
for the purpose of discovering the faulty or ambiguous items
and for the additional purpose of discovering the relative difficulty of
the several items. From the results of this preliminary use of the
exercises two or more roughly scaled forms of the tests may be set up
for further experimental use. From the results of this second trial it is
usually possible to equate the forms of the tests quite closely by shifting
hard and easy items from one form to another until approximate
equality is reached. Then the tests are ready for a further trial in 
a large number of representative classes for the purpose of further 
equating the forms and establishing norms. It is thus apparent that 
while standardization is only one of the final steps in the preparation 
of a carefully made test, it is this extensive sampling of the results of 
the use of the test in many classrooms which affords the basis for the 
assignment of meaning to the test scores.

126. Meaning of Norms. 

Many of our present-day standardized tests began their existence 
as informal objective examinations. In fact, many informal exam¬ 
inations of the objective type meet all other criteria of standard tests 
except that they do not have norms for the evaluation of their scores. 
Standardized tests are characterized by the fact that they are com¬ 
monly accompanied by norms representative of the type of accom¬ 
plishment which may be expected from classes similar to those used 
in the standardization program. Norms thus furnish the necessary 
information for the interpretation of the test scores and for the eval¬ 
uation of achievement of a class. They are obtained by giving the 
particular test to a large and representative sampling of pupils in the 
same grades and of a type similar to the group which the teacher 
wishes to test. To the extent that the sampling is distributed over a 
large population in typical school situations and the conditions under 
which the tests are to be administered are rigidly followed, the norms
furnish a reliable and useful basis for interpretation. 

127. Standards and Norms. 

The use of the term standardized in the discussion of tests of the 
type for which norms are provided has led to the development of a 
careless tendency to treat the words “standards” and “norms” as being 
synonymous. The process of securing the data for the critical analysis
of tests and the derivation of suitable norms is properly known as
standardizing. However, the term standards, when used to refer to
levels of pupil achievement, implies an ultimate goal to be achieved.
Standards may not actually be reached by any individual, but they
are levels of achievement toward which to strive. Norms are the levels
of achievement which typical pupils actually attain. It is clear that,
in the light of these definitions, few tests are accompanied by 
standards. 

128. Specific Uses of Test Norms. 

Although the general function of test norms is to provide a basis 
for the interpretation of test scores, several specific uses should be 
pointed out at this time. For example, test norms give meaning to 
the test score. There is no way of determining except through comparison
with the norm for a test whether a given score is high or low.
To be explicit, is a score of 96 points on the Newkirk-Stoddard Home 
Mechanics Test a high, low, or average score for a pupil to make at the 
end of the year’s course in general shop work? A reference to the 
norms given in Table 50 will give an answer to this question. As a 
matter of fact, such a score is so good that only 25 pupils in a hun¬ 
dred may be expected to do better than that. 

Norms point out to both pupil and teacher the actual levels or goals 
of achievement which both should attempt to attain. That is, the 
norms tell all parties concerned how far they have to go and approxi¬ 
mately when they have arrived. Norms provide almost the only ob¬ 
jective basis for the analysis of individual pupil weaknesses. Certain 
of the better achievement tests in special subject fields are made up 
of a number of different test-parts designed to measure distinct 
aspects of achievement in the subject. Many of these tests are pro¬ 
vided with separate norms for the special parts of the tests making it 
possible to reveal the pupil’s standing in each of the independent test-parts.
(See Table 51.)

Norms for achievement tests used in connection with results from
mental tests make it possible to determine within practical limits
whether the pupil is working up to the real ability he possesses. The
basis for this type of analysis of accomplishment is found in the comparison
between the mental ability of the pupil expressed in terms of
his mental age and his educational achievement as represented by his
educational age.

TABLE 50

End-of-Year Norms for Newkirk-Stoddard Home Mechanics Test

                    Point Scores
                Form A      Form B
Mean             74.7        74.6
Median           77.9        73.0
Q3               95.8        96.3
Q1               60.5        54.5
S. D.            25.1        27.9


TABLE 51

Scores on Iowa Silent Reading Test, Elementary, by a Ninth-Grade Pupil

                                                      Score
Test                                              Part     Test
1. Paragraph meaning
   A. Science                                      18
   B. History                                      22       40
2. Word meaning
   A. General vocabulary                           24
   B. Subject-matter vocabulary                    18       42
3. Selection of central idea of paragraph          10       10
4. Sentence meaning                                28       28
5. Location of information
   A. Alphabetizing                                 3
   B. Use of the index                              9       12
Total comprehension score                                  132
6. Rate of silent reading                                   27

129. Kinds of Norms. 

The kind of norm which accompanies a test depends to a large 
degree upon the level in the school system at which the test is used.
The norm is also conditioned somewhat by the nature of the test itself. 
Tests which are designed for use in the elementary-school grades are 
usually accompanied by two types of norms, grade norms and age
norms. Tests intended for use in the secondary-school grades are usu¬ 
ally provided with semester and grade norms only. Age norms do not 
seem to be particularly useful in the upper grade levels, because so 
many factors other than age operate to affect achievement. Then, too, 
the curve of mental growth flattens out very rapidly in the upper grade
levels so that the increments of growth in achievement from age to
age at the upper levels are not significant. In place of the age norms
for secondary-school and college tests, the common practice today is to
provide quite detailed tables of percentile equivalents for the point 
scores. 

In the lower grades, age as well as grade norms are usually provided
with achievement tests. In general, the type of norm is determined
by the method of grouping the scores when the tabulation for
the norms is made. If the pupils are grouped by grades, without respect
to age or school progress, the resulting norms are grade norms.
If the pupils are classified in accordance with some specific age-scale
as the basis for the tabulation, the resulting norms are age norms.
In the derivation of grade norms for standard tests it is desirable to
have the norms clearly indicate the period they are designed to cover. 


III. RESULTS DERIVED FROM NORMS 
130. Grade Levels. 

Test scores accompanied by a fairly reliable set of grade norms 
can be expressed in terms of the relative position of these scores with 
respect to these grade norms. In fact, this is one of the very simple 
and convenient methods of changing test scores into a form which
even the child or his parents can understand. The fact that an individual
pupil is in the seventh grade or the eighth grade has come to
have some meaning to the average pupil or parent. An isolated test 
score cannot have this meaning. However, as soon as a test score 
is identified with a specific grade level of accomplishment it takes on 
a definite meaning. 


TABLE 52

Grade Norms for Haggerty Reading Examination: Sigma 3

Grade     5     6     7     8     9    10    11    12
Score    40    54    68    80    93   104   112   118


The method of deriving the grade levels (so-called G-scores) from
grade norms is illustrated from the revised grade norms for the
Haggerty Reading Examination: Sigma 3. Table 52 shows the scores
to be expected at the end of the year for each grade. From this table
it is apparent that a score of 93 is the ninth-grade end-of-the-year
norm; a score of 104 is the end-of-the-year norm for the tenth grade,
etc. Thus a student making a score of 104 points on this test may
be described as achieving at a level equal to an average pupil at
the end of the tenth grade. This value may be simply expressed as
10¹⁰. A pupil making a score of 93 points on this test may be assigned
a grade-level position of 9¹⁰, meaning that his achievement is
comparable to that expected at the end of the ninth grade. Pupils
making test scores between 93 and 104 may be assigned grade levels
corresponding to the proportion of the distance between the end
of the ninth (beginning tenth) grade and the end of the tenth
(beginning eleventh) grade work the scores represent. To illustrate,
the score-point distance from 93 to 104 is 11 points. A score of 95
would therefore represent a grade-level distance which is two-elevenths
of the way past the beginning of the tenth grade. For practical
purposes this is two-tenths of a grade. Thus a score of 95 on
this test corresponds roughly to a grade position of 10².
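
The interpolation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `grade_level` and the dictionary `GRADE_NORMS` are hypothetical, the norms are the end-of-year values of Table 52, and the fraction of a grade is rounded to the nearest tenth as in the worked example.

```python
# End-of-year grade norms from Table 52 (Haggerty Reading Examination: Sigma 3).
GRADE_NORMS = {5: 40, 6: 54, 7: 68, 8: 80, 9: 93, 10: 104, 11: 112, 12: 118}


def grade_level(score, norms=GRADE_NORMS):
    """Return (grade, tenths) for a test score.

    The result (10, 2) stands for a grade position two-tenths of the way
    through the tenth grade; (9, 10) stands for the end of the ninth grade.
    Linear interpolation between consecutive end-of-year norms is assumed.
    """
    grades = sorted(norms)
    # A score equal to the lowest norm is the end of the lowest grade listed.
    if score == norms[grades[0]]:
        return (grades[0], 10)
    for lower, upper in zip(grades, grades[1:]):
        low_norm, high_norm = norms[lower], norms[upper]
        if low_norm < score <= high_norm:
            fraction = (score - low_norm) / (high_norm - low_norm)
            return (upper, round(fraction * 10))
    raise ValueError("score lies outside the range covered by the norms")


print(grade_level(95))   # two-elevenths past the start of grade 10
print(grade_level(104))  # end of the tenth grade
```

Scores below the lowest norm or above the highest cannot be placed by this method, so the sketch raises an error rather than extrapolating.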

131. Age Scores. 

Since age equivalents as derived scores have been discussed in this 
chapter in connection with the meaning of norms, and since they are 
not considered by most test workers to be of very great significance 
in the junior-high-school and secondary-school grades, they are given
no extended treatment here. 

132. Percentile Ranks. 

One of the favorite ways of interpreting test scores in the second¬ 
ary-school and college field is to use percentile ranks. Percentile 
ranks are of particular value in the interpretation of informal and non- 
standardized tests, since they permit the comparison of each indi¬ 
vidual in the group with the group of which he is part. In contrast 
with the method of assigning ranks by relative position, the cal¬ 
culation of the percentile rank expresses the absolute position of the 
individual pupil in his relation to the rest of his group. The calcu¬ 
lation of percentiles involves the division of the total distribution into 
100 equal parts, hence the term percentile. Achievement as repre¬ 
sented by a test score is expressed as a position in a population of 
100 cases. A score representative of high achievement ranks high in
the percentile scale and is excelled by only a small number of cases. 
For example, in Table 53, which shows the percentile norms for a new 
plane geometry aptitude test, a score of 72 points or more is assigned 
a percentile score of 100, meaning that such a score is so high that it 
almost certainly will not be excelled by any one. Table 53 presents 
the percentile norms for this test in a convenient form for transferring 
each possible test score into percentile equivalents. Since percentile
scores represent the position of the individual score in a distribution
of infinite population, they are very convenient devices for turning
test scores from unlike scales into comparable measures. There are
many occasions in the interpretation of educational-test results, particularly
in experimental situations, when this is very desirable.

TABLE 53

Iowa Plane Geometry Aptitude Test: Percentile Equivalents of Test Scores

N = 413 (girls 199; boys 214)

Score          Percentile        Score          Percentile
               Equivalents                      Equivalents
72 or more        100            34                55
65 to 72           99            33                53
63 to 64           98            32                50
60                 97            31                46
59                 96            30                42
58                 95            29                40
57                 94            28                36
55 to 56           93            27                33
54                 92            26                30
53                 91            25                27
51 to 52           90            24                25
50                 88            23                23
49                 87            22                20
48                 85            21                18
47                 83            20                16
46                 81            19                14
45                 80            18                12
44                 78            17                10
43                 76            16                 9
42                 75            15                 8
41                 73            14                 7
40                 70            13                 6
39                 68            12                 5
38                 66            11                 4
37                 63            9 to 10            3
36                 60            7 to 8             2
35                 58            6                  1
                                 0 to 5             0

133. Intelligence Quotients. 

The discovery and use of the concept of mental age made possible 
the development of the quotient idea. In a general way, all quotients 
derived from results of measurements express the development of the 
individual as related to average expectancy for his age or mental level. 
Scores on mental tests provide the basis for the derivation of mental
ages. Scores from achievement tests, provided the tests are accompanied
by age norms, may be expressed as achievement or subject ages.
The ratio between the mental age of an individual student and his 
chronological age is called an intelligence quotient. If an achieve¬ 
ment age is used, the resulting quotient is an educational quotient. 

The intelligence quotient (I.Q.) as found in practice is the result
of dividing the mental age (M.A.) of the individual by his chronological
age (C.A.), both expressed in months. The result of this division
is expressed as a whole number by multiplying the quotient by
100. An illustration will make this procedure clear. Let us assume 
that a pupil who is twelve years and four months of age makes a 
score on a mental test which gives him a mental-age equivalent of 
eleven years and three months. At the outset, it is clear that since 
his mental age is less than his chronological age, he has not made 
quite normal development in mental ability. That is, his I.Q. will be
somewhat less than 100. Actually the I.Q. of this individual is

\[
\text{I.Q.} = \frac{\text{M.A.}}{\text{C.A.}} \times 100 = \frac{135}{148} \times 100, \text{ or } 91
\]


An intelligence quotient of 100 indicates normal development on 
the part of the individual. A quotient of less than 100 means that 
there is more or less retardation in the development, and a quotient 
of above 100 means more or less accelerated development. It is com¬ 
mon practice for examiners in the psychological clinic to consider I.Q.’s 
of 90 to 110 as average or approximately normal. Quotients above 
110 are considered superior in proportion to the extent to which they
exceed that value. Similarly, quotients of less than 90 are below 
average and inferior in proportion to the amount which they fall 
below that value. I.Q.’s of very high and very low levels are naturally 
relatively rare. All these interpretations of the intelligence quotient 
are of course dependent upon the reliability of the measuring instru¬ 
ment on which they are based. 
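
The arithmetic of the quotient and the clinical bands just described can be sketched in Python. The function names below are hypothetical and chosen only for illustration; the bands follow the 90 to 110 convention stated above, and rounding to the nearest whole number is assumed.

```python
def months(years, extra_months=0):
    """Convert an age given in years and months to months."""
    return years * 12 + extra_months


def intelligence_quotient(mental_age_months, chronological_age_months):
    """Ratio of mental age to chronological age, times 100, as a whole number."""
    return round(100 * mental_age_months / chronological_age_months)


def describe_quotient(iq):
    """Classify a quotient using the 90-110 'average' band described above."""
    if iq > 110:
        return "superior"
    if iq < 90:
        return "below average"
    return "average"


# The pupil in the illustration: C.A. 12 years 4 months, M.A. 11 years 3 months.
iq = intelligence_quotient(months(11, 3), months(12, 4))
print(iq, describe_quotient(iq))
```

Since an educational quotient is derived by the same procedure, the same function would serve if an educational age in months were supplied in place of the mental age.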

134. Educational Quotients.

Many of the better achievement tests designed for use in the ele¬ 
mentary and junior-high-school grades are equipped with age norms 
which permit the expression of achievement scores as educational ages. 
These educational-age scores make it possible to derive an educa¬ 
tional quotient by following a procedure identical with that used in 
deriving the intelligence quotient. Since age norms have been found to 
be impractical for most of the educational achievement tests designed 
for high school, educational quotients have not been used very widely 
in secondary-school measurement. Subjects such as make up the bulk
of the industrial arts field do not lend themselves especially well to
standardization on an age basis. Hence there is not very great likelihood
of using these educational quotients in measurement in industrial
education. The same comment appears to hold for the accomplishment
quotient (A.Q.), a ratio designed to indicate the relative
degree to which an individual student is utilizing his capacity to
achieve. The basic idea back of the accomplishment quotient has received
some consideration in an earlier section of this chapter.

The various quotients and other measures utilized in the interpretation
of educational-test results can be made effective servants of the
teacher only through extensive experience in their use. A complete
control can be gained only through practice in their calculation and
interpretation. Mastery can be retained only through continued use.¹

IV. RESULTS FROM INFORMAL OBJECTIVE TESTS 

135. Objectifying the Marking System. 

A critical examination of the marking system and the marks assigned
by teachers makes it very apparent that some radical improvements
in these phases of educational measurement are needed. As a
result of an extensive survey of the problem, and a study of the recommendations
of educators who have studied the marking system, the
following program for eliminating many of the unsatisfactory features
of the present methods of assigning marks is submitted:

1. Discard the practice of marking pupils in percentages. Four
reasons are advanced for this decision: (a) The percentage scale has
for its only fixed points 0 and 100. The former means just no ability
while the latter means perfect mastery. Yet the complete scale is
practically never used in practice. (b) The establishment of the limits
of the scale fixes the intermediate values. Accordingly, the difference
between marks of 75 and 76 should be the same as the difference between
marks of 97 and 98. Common observation reveals the absurdity
of this assumption. (c) The use of the percentage scale presupposes
that the teacher is able to distinguish as many as 101 minute differences
in accomplishment. Experimental evidence² reveals that teachers
are able to distinguish from five to seven levels of ability. To use
a finer scale assumes an exactness of discrimination on the part of
teachers which does not exist. (d) The use of an arbitrarily selected
percentage as a passing mark, as is very commonly done, results in
throwing the marks into a badly skewed distribution with too large a
proportion of the marks piled up at or near the passing mark.

¹ Extensive opportunity for practice in the derivation of grade equivalents,
age scores, percentiles, and quotients of various types is provided in the
Work-Book in Educational Measurements (Longmans). See particularly Problems
30 to 33.

² Ruch, G. M., The Objective or New-Type Examination, Scott, Foresman
and Company, Chicago, 1929, pages 370-374.

2. Each mark assigned to a pupil should be a symbol designed to
indicate his power to do. This symbol should be defined in exactly 
worded statements, understood alike by teachers, administrators, and 
pupils. 

The following definitions of letter grades by Hillbrand³ are given
as an illustration of the type of statements that should be prepared 
by the teacher for the purpose of defining each of the letter steps in 
the five-point scale. 

Grade  Definition

A      1. Consistently does more than is required.
       2. Has wide vocabulary at his command.
       3. Is always alert, takes active part in discussions.
       4. Has unusual dependability in taking assignments.
       5. Is prompt, neat, and thorough in all work, and unusually free from
          teachers’ correction.
       6. Knows how to select books, tools, materials, and is a rapid worker.
       7. Has initiative and originality in attacking problems.
       8. Has ability to associate and rethink the problem and can adapt
          himself to new and changing situations.
       9. Has enthusiasm for and interest in his work.
      10. Has ability to apply ideas gained in study to everyday life.

B      1. Frequently does more than is required.
       2. Has good vocabulary and speaks with conviction.
       3. Unusually alive to the situation at hand.
       4. Careful in complying with assignment.
       5. Eager attack on new problems; profits from criticism.
       6. Prompt, neat, thorough, and unusually accurate in all work.
       7. Has ability to apply general principles of the course.

C      1. Does what is required.
       2. Possesses a moderate vocabulary.
       3. Willing to apply himself during class hour.
       4. Does daily preparation with comparative freedom from carelessness.
       5. Attentive to assignments.
       6. Has ability and willingness to comply with instructions and a
          cheerful response to correction.
       7. Reasonably thorough and prompt in all work.
       8. Has average neatness and accuracy in all work.
       9. Has ability to retain collectively the general principles of the
          course.

D      1. Usually does what is required.
       2. Attendance often irregular.
       3. Tools and equipment sometimes lacking.
       4. Frequently “misunderstands” assignment.
       5. Willing but slow in complying with instructions and corrections.
       6. Careless in preparation of assignments.
       7. Lacking in thoroughness and sometimes tardy with work.
       8. Careless in presentation of work.

Fd     1. Usually does a little less than is required.
       2. Listless and inattentive in class.
       3. Tools and equipment for work often lacking.
       4. Always tardy with work.
       5. Seldom knows anything outside the lesson.
       6. Retains only fragments of the general principles of the course.
       7. Lacking in qualities of the first three groups to the extent that he
          cannot or will not do the work.

³ School and Society, Vol. 21:142, January 31, 1925.

3. Each teacher should give objective examinations or quizzes 
frequently throughout the term, and the scores from these tests should 
afford the major basis for his marks. Prior to the assignment of 
marks for a school period or semester the pupils should be ranked on 
their test scores and these scores should then be transformed into 
marks on a five-point letter scale by the use of the standard deviation 
technique in large sections or classes (thirty or more). In small 
classes (less than thirty) this may be accomplished somewhat more 
simply by dividing the distribution of scores into five groups and as¬ 
signing the designated mark to previously determined percentages of 
the class. 

The letter grades used and the typical percentages of the class assigned
each grade under these conditions are as follows:

Letter Grades             A       B       C       D       Fd
Percentage of class      4-6    19-21   48-52   19-21    4-6
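
For a small class, the simpler procedure of ranking the scores and assigning marks to fixed percentages of the group might be sketched as follows. This is an illustration only: the function name `assign_letter_grades` is hypothetical, the shares of 5, 20, 50, 20, and 5 per cent are the midpoints of the ranges given above, and no special treatment of tied scores is attempted.

```python
# Midpoints of the typical percentages listed above (an assumption of this sketch).
GRADE_SHARES = (("A", 5), ("B", 20), ("C", 50), ("D", 20), ("Fd", 5))


def assign_letter_grades(scores, shares=GRADE_SHARES):
    """Return a list of letter grades, one per score, in the original order.

    Pupils are ranked from highest to lowest score, and each letter is given
    to its share of the ranked list. Tied scores are not handled specially.
    """
    n = len(scores)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    grades = [None] * n
    start = 0
    cumulative = 0
    for letter, percent in shares:
        cumulative += percent
        end = round(cumulative * n / 100)  # last rank position for this letter
        for i in ranked[start:end]:
            grades[i] = letter
        start = end
    return grades


# Hypothetical class of twenty pupils with distinct scores 1 to 20.
marks = assign_letter_grades(list(range(1, 21)))
print(marks)
```

In practice a teacher would probably inspect the cut points and move any that fall between tied or nearly equal scores, which is consistent with the ranges (rather than fixed values) given for each letter.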


The essential steps in the assignment of grades by the standard 
deviation method are outlined in Chapter XIV. The actual solution 
of a problem utilizing this method in the assignment of marks to ob¬ 
jective-test scores from a class of forty-five pupils is shown in 
Table 45, page 208. 

4. Require teachers to prepare in advance for each six-weeks pe¬ 
riod carefully worded statements of the objectives of each subject for 
that period. Unless this is done, no one can determine whether or not 
the pupils are being tested on the points on which they should be
tested. This statement of objectives should be the criterion by which
the validity of the objective tests is determined. 

5. Work prepared for daily assignments should be treated as a
requirement of the course, but marks assigned should be determined by
numerous brief objective quizzes or tests upon the work assigned.

6. Notebook and laboratory work should be treated as a requirement
of the course, and credit should be deducted or withheld for work
which is unsatisfactory or incomplete. However, the marks assigned
should be determined by frequent objective tests on the work rather 
than on the basis of the notebook or laboratory work which may or 
may not be the pupil’s own work. 

7. Assign marks on “accomplishment” or “performance” rather 
than on indefinite subjective factors such as effort, attitude, ability, 
etc. 

8. Final grades summarizing all the quiz and test grades for the 
course can be obtained quite readily by assigning point values to each 
letter grade, computing the actual average for each pupil, and then 
re-assigning the class marks on the basis of these averages. This is a 
very simple way of assigning final grades for fairly large groups and 
in courses in which a relatively large number of objective marks are
to be summarized in the final grade. It also permits the application
of a definite schedule of weighting for certain period and final tests
in accordance with the teacher’s judgment of their importance.

The accompanying table of point-values (Table 54) corresponding
to specific letter grades may be useful to the teacher. Values are 
suggested for plus and minus values of the letter grades as one means 
of softening some of the shock from the arbitrariness of letter grades 
assigned on the basis of the normal curve. Students whose test scores 
fall just below the point where a superior grade is given sometimes 
feel that this is a distinct element of unfairness in the system. As¬ 
signing the plus and minus letter grades to their quiz scores serves to 
take care of this problem quite adequately. 
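
The averaging step in item 8 can be sketched in Python using the suggested point values from the table of point values for letter grades. The names `POINT_VALUES` and `point_average` are hypothetical, and the weights shown in the example are arbitrary illustrations of a teacher's weighting schedule, not values taken from the text.

```python
# Suggested point values for letter grades, as given in the table of point values.
POINT_VALUES = {
    "A+": 16, "A": 15, "A-": 14,
    "B+": 12, "B": 11, "B-": 10,
    "C+": 8, "C": 7, "C-": 6,
    "D+": 4, "D": 3, "D-": 2,
    "Fd": 0,
}


def point_average(letter_grades, weights=None):
    """Weighted average of the point values of a pupil's letter grades.

    If no weights are given, every quiz or test counts equally.
    """
    if not letter_grades:
        raise ValueError("at least one grade is required")
    if weights is None:
        weights = [1] * len(letter_grades)
    if len(weights) != len(letter_grades):
        raise ValueError("one weight is needed for each grade")
    total = sum(POINT_VALUES[g] * w for g, w in zip(letter_grades, weights))
    return total / sum(weights)


# Hypothetical pupil: two quizzes and a final test counted double.
print(point_average(["A", "B+", "C"], weights=[1, 1, 2]))
```

The averages obtained for all pupils would then be ranked and converted back to class marks by whichever method of assigning letter grades the teacher has adopted.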

TABLE 54

Suggested Point Values Corresponding to Letter Grades

Grade    Points
A+         16
A          15
A-         14
B+         12
B          11
B-         10
C+          8
C           7
C-          6
D+          4
D           3
D-          2
Fd          0


V. SUMMARY

This chapter deals with the practical steps in the analysis of test
results which make it possible for the classroom teacher to utilize and
profit from these results.

The acceptance of the notion that a test score is merely a numerical
expression of performance which, subject to the conditions operating
at the time, reveals the ability of the individual is essential to a
safe and sane interpretation of the meaning of test scores. A recognition
of the inferences involved in the interpretation of tests of general
or mental ability will do much to protect against the over-interpretation
of the results of such tests.

The meanings of the terms standardization, standards, and norms 
are clearly brought out and illustrated in this chapter. The various 
types of derived scores likely to be useful to the teacher of industrial 
education are discussed. 

Since the major use of tests by the classroom teacher is in the 
evaluation of achievement, the importance of the informal objective 
test and other teacher-made measures is emphasized. Present ten¬ 
dencies are distinctly in the direction of the more systematic use of 
such instruments as the most important single basis for the assignment 
of teachers’ marks. This practical aspect of measurement is so im¬ 
portant that considerable attention is given in this chapter to the 
discussion of possible methods of improving the marking system. After 
all, the marking system is the one phase of educational measurement 
with which practically every teacher comes into close contact. 

EXERCISES IN INTERPRETING RESULTS OF TESTING 

1. Elaborate your interpretation of the basic concepts underlying the meaning 
of a test score. 

2. Show by illustration the real differences between test norms and test 
standards. 

3. By referring to Table 52, compute the grade levels (G-scores) corresponding 
to scores of 74 and 100. 

4. Find the intelligence quotients (I.Q.) for two individuals, each 12 years 
5 months, whose mental ages are 11 years 9 months and 13 years 11 months, 
respectively. 

5. Criticize the recommendations given for the objectification of the marking 
system given on pages 235 to 238. 

6. In your opinion do the definitions of the meaning of the various letter 
grades have any place in a program for the objectification of teachers' 
marks? 

7. Show how you would average the following letter grades to secure a term 
final grade, assuming all grades to count the same except the final 
examination grade, which is allotted triple weight. What final grade would 
you assign? 

First test — B 
Second test — C 
Third test — B 
Fourth test — A 
Fifth test — A 
Sixth test — C 
Final examination — B 

8. Using the standard deviation technique as illustrated in Table 45, assign 
letter grades to the objective-test scores secured from the second test 
given in Problem 7, Chapter XIV. 

This appendix contains two types of material supplementary to the 
discussions and illustrations in the main body of this volume. The 
list of publishers and distributors of tests should be useful in making 
a contact with additional types of test materials likely to interest 
the industrial education teacher. The glossary of terms used in the 
discussion will help to clarify the meaning of some of the more 
technical expressions. 


APPENDIX A 

PRINCIPAL DISTRIBUTORS AND PUBLISHERS OF TESTS OF 
INTEREST TO INDUSTRIAL EDUCATION TEACHERS 
AND SUPERVISORS 

This Appendix presents a selected list of distributors and pub¬ 
lishers of test material likely to be of interest to industrial education 
students, teachers, and supervisors. Obviously this list does not in¬ 
clude many of the important distributors and publishers of tests of 
more general interest. 


Bruce Publishing Company, Milwaukee, Wisconsin. 

Bureau of Educational Research and Service, University of Iowa, Iowa City, 
Iowa. 

Educational Test Bureau, Minneapolis, Minnesota. 

Ginn and Company, Boston, Massachusetts. 

Gregory Company, The C. A., Cincinnati, Ohio. 

Houghton Mifflin Company, Boston, Massachusetts. 

Manual Arts Press, Peoria, Illinois. 

Marietta Apparatus Company, Marietta, Ohio. 

Public School Publishing Company, Bloomington, Illinois. 

Scott, Foresman and Company, Chicago, Illinois. 

Smith, Turner E., Atlanta, Georgia. 

Stanford University Press, Stanford University, California. 

Stoelting Company, C. H., Chicago, Illinois. 

Teachers College Bureau of Publications, Columbia University, New York. 

World Book Company, Yonkers, New York. 

APPENDIX B 

GLOSSARY 

This glossary is appended for the convenience of the student or 
teacher who may find that many of the terms used in this treatment 
are outside of his experience. 

ability. Power to produce; the result of school training and environment 
operating on capacity. 

accomplishment. Used synonymously with achievement or production. 

age norms. Typical performance of subjects grouped by age groups. Usually 
expressed as the average of actual performance of subjects of different age 
groups. 

age scores. The age equivalents assigned to given point scores on tests 
provided with age norms. 

alternate response. Used in describing any objective test exercise in which the 
subject must choose between two possible answers, one of which is right 
and the other of which is wrong. See true-false. 

ambiguity. A lack of clearness or definiteness in the statement of a fact or a 
test item. 

analytical test. A test which, by taking cross-sections of abilities related to 
total accomplishment in a subject, furnishes a basis for an analysis of the 
underlying skills but does not necessarily reveal their interrelationships or 
causes of weakness. 

aptitude. Predisposition for successful achievement in a given field. 

arithmetic mean. A measure of central tendency commonly called the average. 

array. A collection of data usually arranged around a particular function. 

assumed mean. The mid-point of the class interval taken as the zero point in 
laying off deviations in computing the arithmetic mean from a frequency 
distribution. 

capacity. Power to learn or to profit from training. 

character traits. Qualities of the individual such as mentality, honesty, morality, 
sense of humor, sympathy, etc., which make up personality. 

chronological age. The life age of an individual. 

classification of pupils. The placement of pupils in a school system in groups 
by grades or ages for more economical instruction. 

coefficient of contingency. A measure of relationship used in the critical analysis 
of test items. Based upon a comparison of the frequency of cases found in 
each category with the frequency which we should expect to find if the 
traits were completely unrelated. 

composite score. A single value used to express the results obtained from a 
number of different measures. 

comprehension. The degree of understanding of an exercise or material read. 

conditions. Factors causing variations in testing or experimental situations. 

correction. A remedy or adjustment. Also in a technical sense in connection 
with the computation of the arithmetic mean. 

correction for chance. In alternate- or multiple-response tests there is a 
certain opportunity for guessing to enter. The correction for chance is the 
adjustment for guessing. 

corrective. Used synonymously with remedial. Implies the remedying of 
observed defects or difficulties. 

correlation. The relation between two or more series of measures of the same 
individuals or items. 

criterion. The standard by which the validity of measurement may be 
determined. 

diagnosis. Exact identification and location of strengths and weaknesses. 

diagnostic test. A test sufficiently reliable and detailed in content to identify 
and reveal individual pupil weaknesses. 

difficulty. When used in reference to test items it implies a large percentage of 
incorrect responses. 

discrimination. The quality in a test or test item which enables it to 
distinguish adequately between varying levels of ability. 

educational guidance. A program designed to direct pupils into school activities 
in which they are likely to succeed and find most profit, and away from 
fields in which difficulties and failures are almost certain to be encountered 
by the child. 

error of grouping. A variable error entering into the tabulation of data in 
frequency distributions. Brought about through the practice of placing 
together in class intervals measures which may be widely unlike. 

error of sampling. The result of using a too limited number of cases as being 
typical of a large group. 

essay-type test. See traditional examination. 

exercise. A unit of work in a test governed by a specific set of directions. 
Sometimes used in the sense of a stimulus for drill. 

extraversion. The process of being interested in and stimulated by persons and 
things outside of oneself. 

fore exercise. A preliminary or practice exercise for the purpose of giving the 
pupil experience with the specific test situation. 

form. Used to distinguish between two or more closely equivalent 
arrangements of similar but not identical test items. 

frequency. The number of measures in a given interval or tabulation. 
Frequently indicated by the symbol f. 

frequency table. A distribution showing the number of measures assigned to 
successive class intervals. 

fulcrum. The axis upon which a lever is supported and rotated. 

general ability. Same as general intelligence. A test of general capacity. 
Contrasted with achievement. 

grade equivalent. The grade or fraction of a grade nearest which a pupil's test 
score places him when compared with the grade norms for the test. 

grouping. The process of classifying data into certain categories. 

group test. A test designed for administration to a number of individuals at 
the same time. 

home mechanics. A term used to describe manual tasks arising from the 
maintenance and repair of household articles and equipment. 

individual differences. Observed or measured unlikenesses in pupils in capacity, 
ability, etc. 

informal test. A teacher-made instrument as contrasted with a standardized 
test. 

intelligence. The power to learn, or to profit from training. 

I.Q. Intelligence quotient. An index expressing relative brightness as the ratio 
of mental age to chronological age. 

interpolation. The process of locating an intermediate point between two known 
points in accordance with the operation of laws conditioning the case at 
hand. 

interpretation. The explanation of results and the application of same to a 
concrete situation. 

interval. Used interchangeably with step in preparing a frequency table. 

introversion. The process of having one’s interests turned in oneself. 

manipulative tests. Performance tests in which the subject turns out an 
objective product as a result of planning and tool operation. 

matching-type test. A type of test item in which the stimulus and response 
forms are presented in parallel columns for convenience in recording the 
identification. 

mean. The arithmetic mean. The point on a scale of values about which the 
deviations are least. 

median. A common measure of central tendency. See definition in the text. 

mental ability. The power to learn. 

mental age. The mental ability of an individual expressed in terms of the age 
of an average individual having that ability. 

mid-point. The exact middle of a step in a frequency table. 

multiple-choice test. A type of objective test made up of exercises arranged in 
such a way that the subject must select one or more correct responses from 
a group of possible responses. 

N. A symbol used to indicate the number of cases in an array. 

negative correlation. A relationship in which large values of the one variable 
are always accompanied by small values in the other. 

normal. Typical; making regular progress or development. 

norms. Representations of the typical or average performance of subjects of 
different age or grade groups. Usually based on a large number of cases. 

objective. A term used in describing tests in which no opportunity for 
disagreement as to correctness of response exists. 

objectives. Used in the discussion of curriculum construction as synonymous 
with outcomes. 

percentile. The points which divide the total number of cases in a given 
frequency distribution into 100 equal parts. 

performance. Achievement. Also used to distinguish test scores as such from 
ability or capacity. 

personality inventory. A personal rating device from the results of which certain 
personality characteristics are revealed. 

power tests. Tests which express achievement in terms of the difficulty of the 
task which the subject is just able to perform. 

practice effect. Increase in a test score due to previous experience with the test. 

product scale. A measuring device listing variable characteristics on which 
judgments are to be based. 

prognostic test. A test designed to predict probable future achievement on the 
basis of present performance. 

quality scale. A device which measures by comparison with a set of standard 
specimens the result of applying some specific skill. 

quartiles. The result of dividing a distribution into quarters. 

random sampling. A selection of cases on a purely chance basis. 

range. Scale difference between extremes of an array. 

rank. Position assigned to a score in a series. 

rate test. A device which measures achievement in terms of the number of 
tasks of uniform difficulty which can be performed in a specified time. 

rating scales. Measuring devices which set up levels of qualities or products 
for the guidance of judges in evaluating such qualities or products in the 
classroom or shop. 

recall test. A test or exercise which calls for the subject to state the answer 
rather than to recognize it among several possible responses. 

recognition test. A test or exercise in which the student merely identifies the 
correct form of response from several possibilities. 

relative rank. Position assigned to scores in a limited array. 

reliability. A technical expression of the consistency with which a measuring 
instrument performs. 

reliability coefficient. An index found by the process of correlation indicating 
the relation which may be expected between successive administrations of 
the same measuring instrument. 

remedial. Material and devices which are designed to correct existing weaknesses 
in learning or mastery. 

retarded. Used to imply school progress or mental or educational development 
which is slower than is expected of the normal subject. 

scores. A description of the performance of a subject. 

sigma. Synonymous with standard deviation. 

standard deviation. A common measure of variability of scores. 

standardization. The process of refining a test and setting up objective goals of 
performance. 

standards. Ultimate goals of achievement. Mistakenly used synonymously with 
norms which imply actual levels of accomplishment. 

subjectivity. The degree to which measurement results are affected by personal 
factors or judgments. 

survey tests. Tests which have for their main purposes the measurement of 
abilities in terms of broad general functions. 

tabulation. The process of classifying data in tables for condensation and 
interpretation. 

teacher’s marks. The personal evaluation of the pupil’s accomplishment in a 
specific field of activity assigned by the classroom or shop teacher. 

technique. Skill in executing tool or machine operations. 

test. Any type of measuring device by which a numerical expression of the 
pupil’s performance is secured. 

traditional examination. Examinations or tests of the non-objective or 
discussion type. 

training. The learning opportunity afforded through school, shop, or other life 
contacts. 

true-false test. A recognition-type test in which the individual is called upon 
to determine the truth or falsity of items. 

T-scores. A derived test score based on the standard deviation unit. 

unit of measurement. The quantity or quality used as the basis for expressing 
differences. 

validity. A term used to express the degree to which a measuring instrument 
measures the thing it purports to measure. 

vocational guidance. A program designed to direct individuals into vocational 
activities for which they are suited and away from activities for which they 
are not suited. 

zero point of a scale. The point of origin of the instrument. 
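
Several of the glossary entries (I.Q., assumed mean, correction for chance)
describe simple computations, and a short Python sketch may help to make
them concrete. The glossary gives these terms only in words, so the details
below are assumptions: the I.Q. ratio is multiplied by 100 as is
conventional, the correction for chance uses the familiar "rights minus
wrongs divided by one less than the number of choices" rule, and all
function names and sample figures are hypothetical.

```python
def months(years, extra_months=0):
    """Express an age given in years and months as a number of months."""
    return 12 * years + extra_months


def iq(mental_age_months, chronological_age_months):
    """Ratio of mental age to chronological age, times 100 (by convention)."""
    return 100 * mental_age_months / chronological_age_months


def mean_from_assumed_mean(midpoints, frequencies, assumed_mean):
    """Arithmetic mean of a frequency table by the assumed-mean method.

    Deviations of each class mid-point are laid off from the assumed mean;
    their frequency-weighted average is the "correction" that is added to
    the assumed mean.
    """
    n = sum(frequencies)
    weighted_deviations = sum(
        f * (m - assumed_mean) for m, f in zip(midpoints, frequencies)
    )
    return assumed_mean + weighted_deviations / n


def corrected_for_chance(right, wrong, choices):
    """Assumed guessing correction: rights minus wrongs / (choices - 1)."""
    return right - wrong / (choices - 1)


# A hypothetical pupil aged 10 years 0 months with a mental age of 11 years.
print(iq(months(11), months(10)))                   # 110.0

# A hypothetical frequency table with class mid-points and frequencies.
print(mean_from_assumed_mean([62, 67, 72, 77, 82],
                             [2, 5, 8, 4, 1], 72))  # 71.25

# A hypothetical true-false test: 40 right, 10 wrong, 2 choices per item.
print(corrected_for_chance(40, 10, 2))              # 30.0
```

Any class mid-point may serve as the assumed mean; the correction term
adjusts for the choice, so the resulting mean is the same as that obtained
by direct averaging.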